1.0 INTRODUCTION 

This report summarizes the System Architecture Study of the Sensor Data 
Validation and Reconstruction Task of the Development of Life Prediction Capabilities For 
Liquid Propellant Rocket Engines Program, NAS 3-25883. The effort to develop 
reusable rocket engine health monitoring systems has made apparent the need for life 
prediction techniques for various engine systems, components, and subcomponents. 
The design of reusable space propulsion systems is such that many critical components 
are subject to extreme fluctuations causing limited life, which is not adequately explained 
by current techniques. Therefore, the need exists to develop advanced life prediction 
techniques. In order to develop a reliable rocket engine condition monitoring system, 
erroneous transducer data must be identified and segregated from valid data. 
Erroneous sensor data may result from either (1) "hard" failures which are typically large 
in magnitude and occur rapidly or (2) "soft" failures which are typically small in magnitude 
and occur slowly with time. The underlying causes of such failures can include physical 
damage (e.g. wire or diaphragm breakage), calibration/software errors, or thermal drift. 
The objective of this task has been to develop a methodology for using proven analytical 
and numerical techniques to screen the SSME CADS and facility data sets for invalid 
sensor data and to provide signal reconstruction capability. This methodology is 
structured to be an element of an overall Engine Diagnostic System [1]. 

The approach taken to develop this methodology has been to evaluate sensor 
failure detection and isolation (FDI) and signal reconstruction techniques relative to the 
problem of SSME sensor data validation. From this evaluation, applicable techniques 
have been identified and an overall computational strategy has been developed to 
provide automated FDI and signal reconstruction capability. The overall computational 
strategy is based on the use of an advanced data synthesis technique which is capable 
of combining the results of several different tests of sensor validity (such as limit checks, 
hardware redundancy, sensor reliability data, and predictive models). The output of this 
task is a software specification for this strategy and a software implementation plan. 

2.0 EXECUTIVE SUMMARY 

The current SSME data validation procedure at NASA MSFC is based on a 
manual review of test data by expert analysts. To date this system has worked well, but 
it is an inefficient use of the valuable time of the expert, who must visually inspect each 
data plot looking for anomalous data. One of the key elements of the Engine Diagnostic 
System currently under development by NASA is to exploit recent advances in the 
computational and graphics performance of modern RISC-type workstations and 
advanced computational techniques to automate, streamline, and improve the rocket 
engine diagnostic procedure. The System Architecture Study described in this report has 
addressed this issue for the problem of sensor data validation. Verifying test data is 
essential prior to doing performance calculations or engine health assessments. 

The System Architecture Study has consisted of (1) a review of the current SSME 
data validation process at MSFC, (2) selection of key SSME CADS and facility data set 
parameters for which automated data validation and reconstruction is desirable, (3) 
review and selection of potential techniques for parameter fault detection and 
reconstruction, and (4) development of a computational scheme incorporating the 
techniques. Based on the work conducted in this phase of the program, a software 
specification of the Sensor Data Validation and Signal Reconstruction (SDV&SR) system has 
been developed and is described in detail in Section 4.0. The recommended 
development plan for implementation of the software specification is described in Section 
5.0. 

A wide range of sensor failure modes exists for the SSME digital data sets. Table 
1 lists some of the causes and effects of several known modes documented in the SSME 
Failure Modes and Effects Analysis [2] and documented in the UCR (Unsatisfactory 
Condition Report) database. The resultant transducer signals range from hard-open, 
shorted, noisy, and intermittent to slight drifts and shifts. In order to detect, isolate, and 
reconstruct signals resulting from this wide range of failure modes, several potential 
validation techniques have been reviewed and evaluated. Table 2 summarizes the 
techniques which have been examined for use in the SDV&SR system. 

No single fault detection scheme appears solely capable of accurately detecting 
and reconstructing all of the important SSME sensor malfunctions. Each technique 
provides some evidence regarding sensor failure, and different techniques work best for 
different failure modes. The overall conclusion of this study is that the best approach for 
robust and complete sensor data validation is to use several methods and fuse their 
individual results into a single pass/fail decision for every sensor at every time slice in the 
test. 

Table 1. A Wide Range Of SSME Failure Modes Exist 

Table 2. SSME Data Validation Techniques Reviewed 

Information fusion techniques provide explicit representation of and accounting 
for the uncertainties in the sensors and in the various fault detection schemes. Of the 
various techniques for performing information fusion, Belief Networks have been 
determined to be the most appropriate for advanced liquid rocket engines such as 
the SSME. 

The overall SDV&SR system as currently specified is illustrated in Figure 1. The 
system will run in two major stages: an initial batch processing mode, followed by an 
interactive post-processing mode. In the batch mode, the SSME test data (in 
engineering units) is thoroughly analyzed by the sensor validation system, with PID 
failure detection and PID value reconstruction performed automatically and stored in a 
separate data file. The batch mode process will be completed overnight following a test 
and the results will be available to the analyst at the start of the day. The purpose of the 
interactive mode is to allow analysts to quickly review and understand the results of the 
batch mode processing and either confirm or override the failure and reconstruction 
decisions made by the sensor validation system. 

3.0 TECHNICAL DISCUSSION 

The following section describes the technical progress accomplished during 
Phase 1 of the task. The principal sections consist of (1) review of SSME test data and 
current validation procedure, (2) evaluation of fault detection and signal reconstruction 
techniques, and (3) review of information synthesis techniques for combining tests of 
sensor validity. 

3.1 Review of SSME Test Data and Validation Procedure 

3.1.1 SSME CADS and Facility Digital Data 

The SSME CADS and facility data sets consist of approximately 130 and 280 
individual parameters respectively. Each parameter is identified with a unique number 
code called a parameter identification number (PID). The various PIDs consist of 
transducer signals, calculated parameters, and controller signals. Of the approximately 
410 PIDs, there are approximately 325 actual sensor signals. A complete PID list is 
provided in Appendix A. 

A review of CADS and facility sensors for which automated data validation and 
reconstruction would be desirable has been conducted. The list of sensors selected for 
validation and signal reconstruction is given in Table 3. A total of 115 
transducers have been selected. The criteria used to select a sensor for validation 
were, in order of importance: 

(1) Is the sensor an engine control parameter? 

(2) Is the sensor an engine redline parameter? 

(3) Is the parameter plotted for pre or post-test reviews? 

(4) Is the sensor used in the steady state engine power balance model? 

(5) Additional sensors providing redundancy or correlation to sensors in (1) 
through (4). 

Categories (1) and (2) comprise the most critical sensors in the engine. Sensors 
which fall under category (3) are assumed to be important for diagnosing engine health 
since they are routinely examined by the MSFC SSME data analysts. It is expected that 
engine diagnostics elements of the Engine Diagnostic System (EDS) would use the 
same set of inputs. The sensors falling under category (4) should be validated since 
they are used in the performance model to calculate the specific impulse of the engine. 

Table 3. SSME Sensors Selected For Sensor Validation 

Measurement PID    Set    Units    Data Low    Data High    Description 

(only surviving entry)                                      MCC COOLANT DISCH PRESS CH A1 

The specific breakdown of sensors for which data validation and reconstruction is 
recommended is: 

Engine Control Parameters              13 PIDs 
Flight Red Line Sensors                16 PIDs 
Test Red Line Sensors                  23 PIDs 
Sensors Used For Engine Diagnostics    39 PIDs 
Sensors Used In Power Balance Model    23 PIDs 

3.1.2 Current SSME Data Validation Procedure 

The current SSME data validation procedure at NASA MSFC was reviewed to (1) 
determine the performance requirements (turn-around time, accuracy, etc.) of an 
automated data validation and reconstruction system from a user standpoint, (2) assess 
the techniques currently employed for data validation, and (3) obtain first-hand 
knowledge of the characteristics of the SSME data. Interviews were conducted with 
NASA and Martin Marietta personnel directly involved in day-to-day evaluation of test 
data. Summaries of the specific interviews are contained in Appendix A. 

Sensor data validation is the responsibility of Martin Marietta data analysts 
employed at NASA MSFC. Data validation is performed as part of their overall 
responsibility for assessing the health of particular engines. The current MSFC data 
validation process is illustrated in Figure 3. The elements of the process are described 
below. A detailed description of the process is contained in Reference [1]. 

1.0 Following an SSME test firing at NASA Stennis, the raw test data 
(voltages) are converted to engineering units using transducer calibration 
data. The data is transferred to NASA MSFC and downloaded to a Perkin 
Elmer 4 computer system. 

2.0 A standard set of plots is prepared and is available to the data analyst by 
8:00 am following the day of a test. Included in these plot packages are 
data from previous test firings which have been requested by the data 
analysts. These previous data are chosen from the most recent tests 
which involved either (1) the same engine, (2) the same power-head 
set, and (3) preferably the same test stand (A1, A2, or B2). 

3.0 The test data packages and data files are simultaneously reviewed by two 
groups. The first group is the performance analysis group, which uses the 
test data as input to the SSME steady state power balance model. 

4.0 Concurrent with the power balance analysis, the Systems Analysis Group 
(typically one lead analyst and one support analyst) performs a manual 
review of the data and uses existing data validation codes to screen the 
data. The group currently uses two FORTRAN computer codes to detect 
faulty data. 

Two Sigma Comparison Code: 

Spike Detection and Shift Code: 

5.0 Comparisons of results are made between the two groups to identify 
potential faulty data. By the conclusion of process 5.0 (typically an eight 
hour shift), all anomalies detected in the data are attributed to engine 
behavior or transducer malfunction. 

6.0/7.0 The results of the test data analysis process (3.0, 4.0, and 5.0) are 
presented in the post test review. Instrumentation action items are flagged 
for the next pretest review. 

Figure 3. Current NASA MSFC Data Review Process 

Sensor data validation occurs in steps 3.0 and 4.0. As indicated on the process 
flow diagram, results from the power balance calculation and the manual data review are 
shared. It is not uncommon for the initial run of the power balance model to produce 
anomalous results (typically a noticeable change in calculated specific impulse). Through 
detailed inspection of the input PIDs and intermediate results, failed 
sensors are identified and excluded from the input deck of the model. Soft failures 
present a particular problem to the power balance model because their magnitude is 
often not large enough to violate currently employed limit checking procedures, but can 
significantly impact calculation of key performance parameters. 

When a sensor is suspected of a failure, a "confirmation" procedure is used to 
confirm failure. The experienced data analyst will look at the following evidence to 
determine a sensor's validity. 

Pre and post test values indicate if the transducer was scaled, calibrated, 
or installed improperly. Additionally, if a sensor does assume a normal 
steady post test value, then transient effects such as thermal drifts may 
have caused the failure. 

Related sensors (such as upstream and downstream pressures and 
temperatures) are inspected to see if they agree with the failed sensor. 

The signal is compared to previous measurements made with the same 
engine components. 


Several aspects of the current NASA MSFC data validation procedure have been 
adopted in the Sensor Data Validation and Signal Reconstruction System described in 
section 4.0. These key features are: 

1. integration of many sources of information for determining sensor 
validity (see Section 3.3); 

2. use of calculated engine system parameters to indicate an 
inconsistent sensor reading (see Section 3.2 on characteristic 
equations); 

3. comparison of data patterns to known "nominal" patterns. 

3.1.3 SSME Sensor Failure Modes 

A wide range of sensor failure modes exists for the SSME digital data sets, as 
summarized in Table 1. The general requirements of the sensor data validation and 
reconstruction system to detect these different types of failure modes are described in the 
Systems Users Requirements Summary Report in Appendix A. During the task, data 
from 20 recent SSME test firings were reviewed to identify common failure modes. Table 4 
lists 22 sensors which were documented as failed in the 20 tests reviewed. Of the 22 
sensors, 13 failed only once in the 20 tests examined, and a few sensors, such as the Fuel 
Preburner Chamber Pressure, PID 158, which has a history of thermal drift, failed on 85% 
of the tests. Extension of the sensor failure frequency data to a larger number of tests 
will allow a more comprehensive database of sensor reliability to be constructed. The 
use of such data can be incorporated into the SDV&SR data system as described in 
Section 3.3. 

Figure 4 illustrates four common failure signatures of the SSME pressure and flow 
transducers. Thermal drift of the FPB Pc and the OPB Pc are common failure modes 
due to their installation on the SSME. The characteristic behavior is for the signal to 
appear normal during startup, but during mainstage to begin to decay due to icing. 
Intermittent or "hashy" signals are typically due to poor electrical connections, as noted in 
Table 1. These signals generally appear normal except for large spikes off scale, either 
over or under the scaled range. Fuel turbine flow meters and pump speed transducers 
can exhibit signal aliasing such that false signal fluctuations appear in the data. During 
power level transitions, many pressure and temperature transducers experience overshoot 
and undershoot, causing their data to be invalid during a brief period of time while recovery 
occurs. This type of behavior is considered a sensor failure because the data is not 
valid, even though there is not a problem with the transducer. Figure 5 shows some of 
the documented SSME sensor failures. 

Table 4. Documented Sensor Failures Indicate Reliability 
Source: SSME Test Review Summaries 

Figure 4. Typical SSME Sensor Failure Modes 

3.2 Evaluation of Fault Detection and Signal Reconstruction Techniques 

Table 2 summarizes the various fault detection and isolation techniques which 
were reviewed during the task. As noted in the table, applicable literature for the various 
techniques has been reviewed and in some cases demonstrated with available SSME 
test data at Aerojet. A description of the applicability and limitations of each of these 
techniques as applied to SSME data is given below. 

3.2.1 Statistical Comparison Techniques 

Statistical comparison techniques cover the class of techniques where the signal 
value or statistics of the transducer signal are compared to known "acceptable" values. 
Three of these techniques which are suitable for SSME data validation are described 
below. The first technique is limit checking, which serves as a basic indicator of whether a 
transducer signal is within the expected envelope of "nominal" operation. The second 
technique discussed is an extreme value exceedance test which can differentiate true 
signal behavior from spurious spikes in the data. The third statistical comparison test is 
a moving average test which indicates if a significant trend exists in the data. 

Limit Checking 

Comparison of a signal to predefined limits constitutes the simplest form of 
sensor data validation [3,4]. If the signal exceeds the limit it is considered "out of family" 
and indicates either an instrumentation error or an engine component failure. Common 
limits used in data validation schemes are: 

1. High and low data ranges of the transducer 

2. Two or three standard deviation variation from the mean 

3. Comparison of signal statistical values to "family" averaged values. 

Limit checking for sensor failure detection is limited to severe hard failures. In 
order to detect soft failures such as drifts, simple limits must be set so tight that an 
unacceptably high number of false detections occur. Currently the SSME test data is 
compared to the "Two Sigma" database as part of the data analysis. Figure 6 shows 
some typical data plots with the two sigma limits indicated. 
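
The limit checks listed above can be sketched in a few lines. The following Python fragment is a minimal illustration only, not the code used at MSFC; the function name `limit_check`, the transducer range, and the family mean and sigma values are all hypothetical.

```python
def limit_check(samples, low, high, family_mean, family_sigma, k=2.0):
    """Return (out_of_range, out_of_family) flags for each sample.

    low/high     : data range of the transducer (hypothetical values below)
    family_mean  : family-averaged value, e.g. from a "Two Sigma" style table
    family_sigma : family standard deviation
    k            : width of the band in standard deviations (2 or 3)
    """
    flags = []
    for y in samples:
        out_of_range = not (low <= y <= high)
        out_of_family = abs(y - family_mean) > k * family_sigma
        flags.append((out_of_range, out_of_family))
    return flags


# Hypothetical pressure readings: nominal, nominal, nominal, shifted, hard failure.
readings = [3010.0, 3025.0, 2990.0, 3400.0, -50.0]
flags = limit_check(readings, low=0.0, high=3500.0,
                    family_mean=3000.0, family_sigma=40.0)
```

In this sketch the shifted reading stays inside the transducer range and is caught only by the sigma band, while the hard failure violates both limits. A slow drift that remains inside the band would pass both checks, which is the limitation noted above.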

A table of the "Two Sigma" database is given in Appendix C. In order to rationally 
compare different tests, the average values of parameters are taken at (1) the maximum 
fuel turbine temperature, (2) the maximum oxidizer turbine temperature, and (3) at the 
nominal 104% LOX vent. Figure 7 shows a typical thrust profile and the locations of each 
of the conditions listed above. 

The "sigma" bands reflect a wide variation in "nominal" operating conditions for 
some parameters. This variability is due to the averaging effect obtained by using data 
from various engines and test stands when populating a statistical database. While 
these statistically based techniques are relatively effective in detecting signals that have a 
large drift or a low signal to noise ratio, it is possible that a noisy signal can lie within the 
"sigma" band and go undetected. This is shown graphically in Figure 3. 

Data Spikes Detection 

Data spikes are sharp and significant changes in data, not attributable to 
measurable physical phenomena, but due to an instrumentation anomaly, such as a 
malfunctioning A/D converter. Removal of spurious spikes from measured data is 
necessary to improve quality of the data and therefore any conclusions drawn from the 
data. 

One approach to identifying such spurious signals is with extreme value 
probability theory [5]. Extreme value theory is concerned with the probability distribution 
of the extreme value of a sample of n independent observations of a random variable. 
Given this extreme value probability distribution, a detection limit can be established with 
an arbitrarily low probability of exceedance for the largest value of n independent 
observations. 

The theoretically exact distribution of the extreme largest value (Y) from n 
independent observations of a random variable (X) is defined in terms of cumulative 
distribution functions: 

$$F_Y(y) = [F_X(y)]^n$$

The particular value of Y=y corresponding to a cumulative probability p can then 
be determined from the distribution of X as follows: 

$$[F_X(y)]^n = p$$
$$F_X(y) = p^{1/n}$$
$$y = F_X^{-1}(p^{1/n})$$


Thus y is determined directly from the inverse cumulative distribution function of X. 
For example, let there be n = 10,000 independent observations of a normal variate X 
and let p = 0.99 define the arbitrarily established detection limit. Then this detection limit 
(y) is calculated as follows in terms of standard deviations from the mean. 

$$y = F_X^{-1}(p^{1/n}) = F_X^{-1}(0.999999) = 4.76$$

The inverse of the normal cumulative distribution function is conveniently tabulated. 

Therefore if the detection limit is set at 4.76 standard deviations above the mean 
of a normal distribution and if 10,000 independent observations are made of this normal 
random variate, then the single largest of these observations will be expected to exceed 
this detection limit, on the average, one percent of the time. More realistically, many 
sensors which are prone to spiked data will usually have an exceedance probability 
much greater than one percent. This higher probability can be determined by 
examination of previously obtained data, and reflected by lowering p to a more realistic 
value. Such application of extreme value theory can be used to define a detection limit, 
the exceedance of which may reasonably be assumed to constitute a spurious data 
spike. 
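
The detection limit calculation above can be reproduced numerically. The sketch below uses the inverse normal cumulative distribution function from the Python standard library; the function names and the sample readings are hypothetical, and the signal is assumed to be normally distributed with a known mean and standard deviation.

```python
from statistics import NormalDist


def spike_detection_limit(n, p):
    """Limit, in standard deviations above the mean, that the largest of n
    independent normal observations stays below with probability p."""
    return NormalDist().inv_cdf(p ** (1.0 / n))


def flag_spikes(samples, mean, sigma, limit):
    """Flag samples lying more than `limit` standard deviations above the mean."""
    return [(y - mean) / sigma > limit for y in samples]


# Same case as the worked example: n = 10,000 observations, p = 0.99.
limit = spike_detection_limit(n=10_000, p=0.99)

# Hypothetical readings with an assumed mean of 100 and sigma of 2.
spikes = flag_spikes([100.2, 99.7, 112.0, 100.1], mean=100.0, sigma=2.0, limit=limit)
```

The computed limit falls near 4.75 standard deviations, in line with the tabulated value quoted above. Lowering p for a sensor known to be prone to spikes lowers the limit accordingly. The sketch tests only for exceedance above the mean, as in the derivation; a two-sided test would need the limit adjusted.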


Moving Averages 

As discussed in Section 3.1.3, soft failures usually manifest themselves as slowly 
changing drifts typical of thermally sensitive failure modes. As seen in Figure 6, varying 
amounts of dynamic fluctuations of the signals about their mean values occur during 
steady power level operation of the SSME. The sources of these fluctuations are (1) 
quantization error during A/D conversion, (2) electrical noise induced by mechanical 
vibration of engine, and (3) dynamic excursions of the engine resulting from the closed 
loop control logic of the engine. In order to extract the true trend of the signal, a simple 
moving average of the data can be calculated [6,7]. Some SSME failure detection 
algorithms, such as the System for Anomaly and Failure Detection (SAFD) [8], 
are based on monitoring the moving average of many parameters. Figure 9 illustrates 
the smoothing of a noisy signal by a moving average calculation. The 
simplest moving average can be defined as: 


$$\bar{Y}_i = \frac{1}{N} \sum_{j=i-N+1}^{i} Y_j$$

where: N = a fixed number of previous points (25 for a 1 second 
           average of CADS data) 

       Y_j = sensor reading at a given time slice 

The moving average smooths the signal and reveals the true trend of the data. 
While the overall trend is apparent to a trained expert, evaluation of the time derivative 
from the raw signal at a given time slice can yield a meaningless result (e.g. a positive 
value when the true trend has a negative slope). 
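
The effect described above can be illustrated with a short sketch. It assumes a trailing window that includes the current time slice and uses a synthetic decaying signal with alternating noise; neither the function name nor the signal comes from the SSME data.

```python
def moving_average(y, n):
    """Trailing mean of the last n readings (fewer until n are available)."""
    out = []
    running = 0.0
    for i, value in enumerate(y):
        running += value
        if i >= n:
            running -= y[i - n]   # drop the reading that left the window
        out.append(running / min(i + 1, n))
    return out


# Synthetic signal: slow downward drift plus alternating +/-0.5 noise.
signal = [10.0 - 0.1 * i + (0.5 if i % 2 else -0.5) for i in range(50)]
smooth = moving_average(signal, n=25)   # 25 points = 1 second of CADS data

raw_slope = signal[31] - signal[30]     # point-to-point difference of raw data
trend_slope = smooth[31] - smooth[30]   # same difference after smoothing
```

Here the raw point-to-point difference is positive even though the underlying drift is downward, while the difference of the smoothed signal has the correct negative sign.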

The moving average computation provides a good means of extracting trend data 
from signals. For sensor fault detection, however, the signal trend alone is insufficient to identify a bad 
signal. The trends (transients) in sensor values of the SSME can be caused by factors 
other than power level change, such as (1) engine component anomalies, (2) propellant 
transfer which causes changes in propellant inlet temperatures, and (3) propellant tank 
venting and repressurization, which causes pump inlet pressure changes. 

3.2.2 Analytical Redundancy Techniques 

Analytical redundancy for sensor data validation consists of three parts: (1) 
parameter estimation, (2) parameter fault detection, and (3) fault isolation [10, 11, 12]. 
The principal advantage of analytical redundancy techniques over the statistical 
comparison techniques discussed above is that the parameter estimation model 
provides a means of signal reconstruction which is a key element of the SDV&SR 
system. 

The major uncertainties regarding the development of an analytical redundancy 
capability for the SSME sensors have been addressed. These issues are the following: 

1. How many of the sensor PIDs of interest can be modeled as linear 
or nonlinear combinations of other parameters? 

2. Is the accuracy of these models sufficient to enable reasonable fault 
detection? 

3. Can a robust fault isolation methodology be developed for the resulting 
models? 

Issues (1) and (2) above involve a basic tradeoff between model complexity (i.e. number of 
terms in equations and form of model) and the accuracy of the estimate, as illustrated in 
Figure 10. Two schemes which represent the limits of this tradeoff are (1) the "dedicated 
observer scheme" (DOS) [13] and (2) the "generalized observer scheme" (GOS) [14], 
shown in Figure 11. The DOS technique performs sensor fault detection by assigning a 
dedicated estimator to each of the sensors. Each estimator (e.g. a least squares fit of 
another sensor output to previous test data or an ARMA model [15]) is driven by only a 
single sensor output. The output of the estimator, Ŷ, is then compared to the sensor 
measurement, Y, to produce a residual r. The residual is then compared to a threshold 
limit, θ, to determine if a fault has occurred. 
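
A dedicated observer of the simplest kind can be sketched as follows. The estimator here is assumed to be a straight-line least squares fit of one sensor to one other sensor using previous test data; the function names, the training data, and the threshold are hypothetical, and an ARMA model or other estimator could take the place of the fit.

```python
def fit_line(x, y):
    """Least squares fit y = slope * x + intercept from previous test data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dedicated_observer(x, y, slope, intercept, threshold):
    """For each time slice return (estimate, residual, fault_flag)."""
    results = []
    for xi, yi in zip(x, y):
        estimate = slope * xi + intercept      # estimator driven by one sensor
        residual = yi - estimate               # measurement minus estimate
        results.append((estimate, residual, abs(residual) > threshold))
    return results


# Hypothetical previous test data relating a driving sensor to the validated one.
slope, intercept = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])

# Hypothetical new test: the second reading of the validated sensor is biased.
checks = dedicated_observer([2.0, 3.0], [5.1, 9.0], slope, intercept, threshold=0.5)
```

Because the estimate is available at every time slice, it can also stand in for the measurement once a fault is declared, which is the signal reconstruction role described above.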

The "generalized observer scheme" is similar to DOS except that it is constructed 
such that the estimator is driven by all the output sensors except that of the respective 
sensor. In theory, the "generalized observer scheme" provides the most accurate 
estimate (for a given class of estimators, such as linear regression models) of sensor 
output and therefore the best fault detection because it makes use of all available 
information in the system. The obvious drawback of the GOS approach is that fault 
isolation becomes difficult since a single point failure may cause failure of many 
estimators. On the other hand, the "dedicated observer scheme" can easily 
accommodate single point and most multi-point failure instances provided the 
number of different PIDs used as the independent variables is approximately the same 
as the number of equations. 

Parameter Estimation 

Two approaches were investigated to generate estimator models for each of the 
SSME CADS and facility parameters specified for validation and reconstruction. The first 
approach was the use of engine characteristic equations which physically relate 
parameters. The second approach was to generate empirical regression equations 
based on existing SSME test data. Each of these techniques is discussed below. 

Engine Characteristics 

Engine characteristics are parameters which describe the performance of a 
particular engine and its components (Table 5 shows some examples of engine 
characteristics). The set of characteristics for an engine form a "fingerprint" which 
describes the engine’s idiosyncrasies relative to all other engines in the same family 
tested thus far. 

Figure 10. Increasing The Complexity Of Estimator Models Improves Model 
Accuracy 

Figure 11. Schemes For Generating Estimates of Sensors Based on Input 
From Other Sensors 

Table 5. Typical Liquid Rocket Engine Characteristics 

Line Resistance     ΔP / (Density × Flow²) 
Pump Affinity       Flow / Speed 
Pump Affinity       Head / Speed² 
C*                  Pc × At / Flow 


Characteristics can be computed for every engine and then entered into a 
database for comparison with the characteristics from all engines in the family. Any 
characteristic which is "out of family" (the largest or smallest value seen, or close to it) 
warrants investigation. Armed with the equations for computing characteristics, and the 
assumption that only one sensor or component can fail at a given time, analysts can 
quickly narrow in on the sources of anomalies. In the example shown in Figure 12, three 
line resistances are calculated given readings from three pressure sensors, a 
temperature sensor (for computing specific gravity), and a flow sensor. In this example 
the two partial resistances R1-2 and R2-3 are out-of-family, while the overall resistance 
R1-3 is normal. The only explanation for this, assuming a single-point failure, is that 
pressure sensor P2 has failed (i.e., biased high). Had all three resistances been out-of- 
family in the same direction (i.e., high or low) then either the temperature or flow sensors 
would be suspect. 
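
The reasoning in this example can be written out as a small decision routine. The sketch below assumes the line resistance form of Table 5 (pressure drop divided by density times flow squared), a two sigma family band, and a single-point failure; the function names, the family statistics, and the sensor readings are hypothetical. Patterns not discussed in the text are returned as undetermined.

```python
def line_resistance(p_up, p_down, density, flow):
    """Resistance characteristic: pressure drop over density times flow squared."""
    return (p_up - p_down) / (density * flow ** 2)


def family_sign(value, family_mean, family_sigma, k=2.0):
    """Return +1 or -1 if value is out of family (high or low), else 0."""
    deviation = value - family_mean
    if abs(deviation) <= k * family_sigma:
        return 0
    return 1 if deviation > 0 else -1


def isolate(p1, p2, p3, density, flow, family):
    """Name the suspect sensors under a single-point failure assumption."""
    s12 = family_sign(line_resistance(p1, p2, density, flow), *family["R1-2"])
    s23 = family_sign(line_resistance(p2, p3, density, flow), *family["R2-3"])
    s13 = family_sign(line_resistance(p1, p3, density, flow), *family["R1-3"])
    if s12 and s23 and not s13:
        return ["P2"]                      # partial resistances off, overall normal
    if s12 and s12 == s23 == s13:
        return ["temperature", "flow"]     # all three off in the same direction
    if s12 or s23 or s13:
        return ["undetermined"]            # pattern not covered by the example
    return []


# Hypothetical family statistics: (mean, sigma) for each resistance.
family = {"R1-2": (0.2, 0.01), "R2-3": (0.2, 0.01), "R1-3": (0.4, 0.02)}

nominal = isolate(100.0, 80.0, 60.0, density=1.0, flow=10.0, family=family)
p2_high = isolate(100.0, 85.0, 60.0, density=1.0, flow=10.0, family=family)
flow_low = isolate(100.0, 80.0, 60.0, density=1.0, flow=9.0, family=family)
```

With P2 biased high, R1-2 reads low and R2-3 reads high while R1-3 is unchanged, so only P2 is named. A biased flow reading shifts all three resistances in the same direction.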

Engine characteristics provide relatively invariant relationships among small sets 
of sensors, thus they are good predictors for use in sensor validation. One approach to 
using characteristics for sensor validation is the following: 


1. Sample a small segment of data for the engine under test and compute all 
characteristics. 

2. If any characteristic is out-of-family, then suspect all sensors involved in its 
calculation (i.e., there was a possible sensor failure in the initial sample data). 

3. The engine’s characteristics are computed for each time slice and compared to 
the sampled characteristics. If the residual between any sampled and computed 
characteristic is larger than a threshold (say 2 sigma), then the sensors involved 
in the calculation are suspect. 


(See Section 3.3 for a discussion of how these "suspicions" can be integrated into a final 
decision regarding sensor failure.) 
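
Steps 1 and 3 of this approach can be sketched for a single characteristic. The example assumes a pump flow/speed (Q/N) characteristic, a threshold of two standard deviations of the sampled characteristic, and hypothetical flow and speed readings; step 2, the comparison of the sampled value against the engine family, is omitted.

```python
from statistics import mean, stdev


def sample_characteristic(flow, speed):
    """Step 1: mean and standard deviation of Q/N over a short sample."""
    ratios = [q / n for q, n in zip(flow, speed)]
    return mean(ratios), stdev(ratios)


def monitor(flow, speed, sampled_mean, sampled_sigma, k=2.0):
    """Step 3: time slices whose Q/N residual exceeds k sampled sigmas."""
    suspects = []
    for t, (q, n) in enumerate(zip(flow, speed)):
        residual = q / n - sampled_mean
        if abs(residual) > k * sampled_sigma:
            suspects.append(t)     # flow and speed sensors are both suspect here
    return suspects


# Hypothetical sample segment taken from the engine under test.
q_over_n, sigma = sample_characteristic(
    flow=[100.0, 101.0, 99.0, 100.5, 99.5], speed=[50.0] * 5)

# Hypothetical later time slices; the last flow reading has shifted.
suspect_slices = monitor(flow=[100.0, 100.8, 110.0], speed=[50.0, 50.0, 50.0],
                         sampled_mean=q_over_n, sampled_sigma=sigma)
```

The routine only marks the time slices at which the sensors involved in the characteristic become suspect. Deciding which of them has failed requires additional evidence.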

To evaluate the approach outlined above for use on the SSME data, 15 
characteristics relating 19 non-redundant PIDs were derived and tested to see how well 
they would perform as predictors. 

To derive the characteristics, SSME flow diagrams were annotated with available 
PIDs, and the characteristic relations were then encoded through analysis of these 
diagrams. Figures 13 and 14 show examples of annotated flow diagrams along with their 
derived characteristics, and Table 6 shows the complete list of characteristics evaluated. 

To evaluate whether family-averaged characteristics could be used as predictors, 
characteristics were computed for three test data sets at 109% steady-state power levels 
(A2492, A2493, and A2495). Characteristics were averaged over these three runs and 
then used as predictors for a fourth test data set (A2497) at its 109% steady-state power 
level. The results of this experiment are shown in columns 3 and 4 of Table 7. 

As discussed above, an alternative approach to family-averaged characteristics is 
to take a small sample of data from the test set, compute engine-specific characteristics 
from this sample, and then use these characteristics as predictors. A second 
experiment was performed to evaluate this approach. A sample of the 109% steady- 
state data was taken (10 seconds of test A2497) and then used to predict PID values for 
the remainder of the 109% steady-state data. The results of this experiment are shown 
in columns 5 and 6 of Table 7. 
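
The two experiments can be mimicked on synthetic data to show how the predictors are formed. In the sketch below every number is hypothetical and none is taken from tests A2492 through A2497; the characteristic is assumed to be a pump Q/N ratio, so that flow is predicted as the characteristic times the measured speed.

```python
from statistics import mean


def predict_flow(speed, q_over_n):
    """Predict the flow PID from the speed PID and a Q/N characteristic."""
    return [q_over_n * n for n in speed]


def rms_residual(sensed, predicted):
    """Root mean square difference between sensed and predicted values."""
    return (sum((s - p) ** 2 for s, p in zip(sensed, predicted)) / len(sensed)) ** 0.5


# Family-averaged characteristic from three earlier (hypothetical) tests.
family_q_over_n = mean([2.00, 2.10, 1.96])

# Synthetic steady-state data for the engine under test (true Q/N = 2.05).
speed = [50.0, 50.5, 51.0, 50.2, 49.8, 50.1]
flow = [2.05 * n for n in speed]

# Engine-specific characteristic sampled from the first few time slices.
sample = 3
sampled_q_over_n = mean([q / n for q, n in zip(flow[:sample], speed[:sample])])

# Predict the remaining time slices with each characteristic.
rms_family = rms_residual(flow[sample:], predict_flow(speed[sample:], family_q_over_n))
rms_sampled = rms_residual(flow[sample:], predict_flow(speed[sample:], sampled_q_over_n))
```

In this synthetic case the sampled characteristic matches the engine under test and leaves a much smaller residual than the family average. The sketch shows only the mechanics of the comparison and says nothing about the magnitudes reported in Table 7.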

In almost every PID prediction in the two experiments, the sampled characteristic 
performed significantly better as a predictor than the family-averaged one (i.e., the 
residuals, the differences between the sensed and predicted values, were larger 
for averaged characteristics than for sampled ones). This can also be seen graphically 
from plots of sensed vs. predicted PID values. Figure 15 shows a prediction and 
residual for PID 1205 using a family-averaged characteristic (LPFP Q/N). Figure 16 
shows the same prediction using a sampled, engine-specific characteristic. 

A final test was conducted to determine how well characteristic-based predictions 
would perform on transient test data. The characteristics sampled at the 109% power 
level in the previous experiment (for test A2497) were used to predict PID values during 
the first 30 seconds of the same test. Unfortunately, only 5 of the 42 predictions 
performed well enough to be usable during transient conditions. Figure 17 shows a 
typical prediction whose residual is too large during the transient conditions to make it 
usable for sensor validation. 

The characteristic model development work is contained in Appendix B. 

Figure 13. Example of Flow Diagram and Derived Characteristics for Low
Pressure Oxygen Pump/Turbine

Figure 14. Example of Flow Diagram and Derived Characteristics for HPOP
Discharge Circuit



Table 6. Characteristic PID Relations Evaluated

Characteristic      PID Relation

Pump Flow/Speed: Q/N = Constant

LPOP Q/N            P1212/P30 = Constant
LPFP Q/N            P1205/P32 = Constant
HPFP Q/N            P133/P260 = Constant

Pump ΔHead/Speed^2: ΔP/N^2 = Constant

LPOP H/N2           (P209-P860)/P30^2 = Constant
LPFP H/N2           (P203-P819)/P32^2 = Constant
HPFP H/N2           (P52-P133)/P260^2 = Constant

Line Resistance: ΔP/Q^2 = Constant

HPOP R1             (P90-P395)/P1212^2 = Constant
MC R1               (P52-P129)/P133^2 = Constant
MC R2               (P52-P17)/P133^2 = Constant
MC R3               (P17-P436)/P133^2 = Constant
MC R4               (P52-P436)/P133^2 = Constant
PRE R1              (P59-P58)/P1212^2 = Constant
PRE R2              (P59-P480)/P1212^2 = Constant
HPOP R2             (P209-P90)/P1212^2 = Constant
HPOP R3             (P209-P395)/P1212^2 = Constant

Table 7. Results of Steady-State Characteristic Experiments

Empirical Regression Equations

The principal advantages of linear models are that they can usually be solved by a single
matrix inversion and are relatively straightforward to derive. Linear regression models
have been successfully employed on the Advanced Propulsion Monitoring System
Program [3] to detect sensor failures in jet engines. Linear models of dynamic systems
can display poor accuracy when applied over a wide range of dynamic response.
However, during steady state operation of the SSME, linear models appear to work well in
tracking the relatively small-amplitude dynamic perturbations of the engine. A typical
SSME thrust profile comprises over 90% commanded steady state operation.

The procedure being followed for developing the regression equations is shown
in Figure 18. "Nominal" test data sets were partitioned into startup, shutdown, 65%,
100%, 104% and 109% power levels. The SSME test summaries for the data sets on
hand at Aerojet were reviewed for known sensor failures (summarized in Table 4), and
the affected data were excluded from the partitioned sets. The steady state data has been
further screened to isolate the data sets during LOX venting, repressurization, and
propellant transfer operations.

Using the partitioned data sets, the one-to-one correlations between all the
sensors in the CADS data set have been determined using the MATLAB software on the
GFE Sun workstation [16]. As expected, excellent correlation (correlation coefficient
greater than 0.95) was found between redundant sensor channels, and some reasonable
correlation (correlation coefficient greater than 0.5) was found between over half of the
sensors. A high correlation coefficient alone is not sufficient to guarantee that
a true and significant physical correlation exists between signals. The correlated sensors
have therefore been down-selected for physical reasonableness, as determined by subjective
reasoning regarding physical interactions of the SSME. A summary of the correlation
coefficients is included in Appendix C.

From analysis of the correlation coefficients, 42 potential linear regression
relationships were identified and evaluated against SSME data. Tests were conducted
similar to those described above for the characteristic equations. First, the coefficients
of the regression models were developed using "family" data derived from four different
tests at the same power level. Second, the coefficients were evaluated using a small
sampling of data from the beginning of the specific test. The results of these
experiments are summarized below. Typical test results are included in Appendix B.

Two typical residuals (true signal minus the predicted signal) generated using "family"
data are shown in Figure 19. In the first example the Heat Exchanger Exit Pressure (PID
34) is calculated as a function of the POGO Precharge Pressure, HPOPT Intermediate
Purge Pressure, and the HPOPT Secondary Seal Cavity Pressure. The equation was
derived from 109% power level data from test A1-619 and tested on 109% power level
data from test A1-620. The residual shows a slight offset from zero, which is a common
feature of many of the equations tested. This offset is approximately 10% of the
documented SSME sigma for the parameter. The second example is for the POGO
Precharge Pressure (PID 221) as a function of several other parameters, including
its redundant channel. As might be expected, the inclusion of a redundant
channel produces very good correlations with little shift from zero. Equations such as
that in the first example appear promising for providing sensor predictions, considering
that no redundant parameters were included. Equations which include redundant channels,
such as the second example, are heavily weighted by the redundant channel and will
work for data validation and reconstruction only when the sensor failure mode is such
that the loss of one sensor channel does not influence the other sensor channel (e.g.
poor cable connection). These results often show a fixed offset of the signal
representative of the variance of the particular engine relative to the family. This offset
cannot be predicted a priori and may trigger false alarms in a simple fault tree logic
isolation scheme.

The use of engine specific data to generate the regression equations yielded 
more accurate models. The regression coefficients were derived using a 2 second (50 
data points) time slice at the beginning of a given steady state portion of the test. Using 
sampled data from the beginning of the steady state time slice virtually eliminates the 
offset because the coefficients of the equations are calibrated for the particular engine. 

Of the empirical equations evaluated, 28 equations yielded a standard deviation less 
than that of each of the variables and significantly less than the family two sigma 
database. Figure 20 shows two typical examples of the linear regression equations 
derived by sampling the power level. 
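
The engine-specific calibration described above can be sketched as follows. This is a minimal illustration with a single regressor and an intercept, fitted by ordinary least squares on a 50-point calibration slice; the report's equations use several PIDs and would require a matrix solution instead. The function names and the synthetic numbers are assumptions made for the example.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict(coeffs, x):
    """Apply the fitted coefficients to new regressor values."""
    slope, intercept = coeffs
    return [slope * xi + intercept for xi in x]

# Synthetic 50-point calibration slice (placeholder values, not SSME data).
x_cal = [100.0 + 0.1 * i for i in range(50)]
y_cal = [2.0 * xi + 3.0 for xi in x_cal]
coeffs = fit_line(x_cal, y_cal)

# Later data from the same steady-state period; the second point is offset by 0.5.
x_later = [101.0, 103.5]
y_later = [205.0, 210.5]
residual = [yi - pi for yi, pi in zip(y_later, predict(coeffs, x_later))]
```

Because the coefficients are fitted to the engine under test, a nominal point gives a residual near zero, which is the offset-removal effect noted in the text.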

Conclusions Of Parameter Estimation 

Engine characteristic and empirical equations provide a good source of analytical 
redundancy. Although only a small set of all potential relations have been evaluated, a 
larger set covering most of the PIDs on the SSME should be derivable. Table 8 
summarizes the CADS PIDs (based on the sensor set selected for validation) for which
characteristic and empirical equations have been developed. In addition, PIDs for which 
hardware redundant channels are available are also indicated. In summary: 

Characteristics and empirical equations computed from samples of data from the 
engine being evaluated work better than family-averaged ones. 

Sampled equations can be checked for reasonableness by comparison to a 
database of family characteristics. 

The characteristic and empirical equations work better during steady state 
operation than during engine transients. 

Fault Detection 

Once a set of equations (empirical or characteristic) has been developed for the
sensors, fault detection is accomplished by comparing the true signal to the prediction.
If the resulting difference (the residual) exceeds a predefined threshold value, then a
failure of the equation is declared. The cause of the equation failure may be a failure of
any of the sensors in the equation. Figure 21 illustrates how a residual could violate
predefined thresholds, each of which represents a different confidence level as to the
existence of a failure. If the three threshold levels are taken to be multiples of the standard
deviation of the residual computed from nominal data, then the probability of failure could be
assigned by assuming Gaussian statistics. Spikes in the data, which tend to cross the
threshold level for only a single cycle, can be filtered from true violations of the threshold
limits by applying statistical tests.
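
A minimal sketch of the multi-level threshold check follows. It assumes the three levels are 1, 2, and 3 standard deviations of the nominal residual (the report does not state the multiples) and uses the Gaussian assumption to give the chance that a healthy sensor exceeds a level. The function names are hypothetical.

```python
import math

def threshold_level(residual, sigma, levels=(1.0, 2.0, 3.0)):
    """Return the highest sigma multiple exceeded by |residual|, or 0.0 if none."""
    exceeded = 0.0
    for k in levels:
        if abs(residual) > k * sigma:
            exceeded = k
    return exceeded

def chance_exceedance_probability(k):
    """Two-sided probability that a zero-mean Gaussian residual exceeds k sigma."""
    return math.erfc(k / math.sqrt(2.0))

level = threshold_level(residual=2.5, sigma=1.0)   # exceeds the 2-sigma level only
p_chance = chance_exceedance_probability(level)    # chance of this with a healthy sensor
```

The smaller `p_chance` is, the less likely the exceedance is a false alarm, which is how each threshold maps to a confidence level.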

Statistical hypothesis tests can be used to define detection limits for sensors 
measuring well-behaved random data. The most common hypothesis tests are 
concerned with the mean and standard deviation of a normal distribution. If the data of 
interest is approximately normally distributed or if the data can be transformed into 
approximately normal random variables, such hypothesis tests could be applied directly. 
Detection limits could be established either to one side or to both sides, using standard 
methodology, and the probability of false alarm and the probability of detection could be 
rigorously determined. The one-sided binomial confidence limit can be used with 
sequences of observed exceedances of the detection limits to interpret the significance 
of the exceedances. If the exceedances are interpreted as not being false alarms, the 
sensor can be confidently classified as failed. 

Figure 20. Example Sampled Regression Results of SSME Data (both panels: Data
Calibrated With 320-322 second Time Slice)

Figure 21. The Probability of Sensor Failure Based On Model Results Is
Related To Statistics Of Residual Vector

In any hypothesis test, two types of errors are possible. The Type I error is
rejecting the hypothesis when it is true. The Type II error is accepting the hypothesis 
when it is false. If the null hypothesis is defined as the condition that the sensor and the 
relevant component are functioning properly, then the probability of the Type I error is 
the probability of false alarm. This probability may be determined whether there is one 
detection limit (one-sided test) or two detection limits (two-sided test). In most cases of 
sensor failure detection, the two-sided limit test would be the correct one to use. 

Similarly, the probability of the Type II error is the complement of the detection 
probability. 

For a particular application, as the probability of false alarm is decreased, the 
probability of detection is also decreased. The detection limits must therefore be set with 
appropriately balanced values of these two probabilities. If the probability of false alarm 
cannot be made negligible, additional logic may be used to interpret the observed 
exceedances relative to the theoretical probability of false alarm. An appropriate
methodology for the interpretation of the observed exceedances is the one-sided
binomial confidence limit.
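
The interpretation of a sequence of exceedances can be sketched with a one-sided binomial tail probability. This is a simple stand-in for the confidence-limit calculation referred to above, not the report's procedure: it asks how likely it is to see at least k exceedances in n samples if each sample has only the theoretical false alarm probability. The significance level `alpha` and the function names are assumptions.

```python
from math import comb

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))

def exceedances_are_significant(n, k, p_false_alarm, alpha=0.01):
    """True if k exceedances in n samples are unlikely to be false alarms alone."""
    return binomial_tail(n, k, p_false_alarm) < alpha

# With 2-sigma limits the two-sided false alarm probability is about 0.0455.
few = exceedances_are_significant(n=50, k=2, p_false_alarm=0.0455)
many = exceedances_are_significant(n=50, k=10, p_false_alarm=0.0455)
```

Two exceedances in 50 samples are consistent with false alarms at this limit, while ten are not, so only the latter case would support classifying the sensor as failed.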

Fault Isolation 

Following the occurrence of a detected parameter fault (i.e. a failed equation), the
failed PID(s) must be isolated. An approach suitable for the SDV&SR system is based
on fault tree logic [3]. In this scheme, a system of equations for the SSME sensors is
specified, such as that shown in Figure 22. An incidence matrix which codes the
occurrence of independent and dependent parameters in the model is then constructed.
Rows of the incidence matrix correspond to the equations, and each column of the matrix
represents a PID. The matrix is built by entering a one if a PID is present as an
independent variable in an equation or a zero if it is not, as shown in Figure 22. Each
column of the incidence matrix represents a fault detection vector for its specific PID, as
shown in Figure 22. If the threshold limits are set such that a failed PID causes a failure
of all equations in which it appears with equal probability, then a single point failure
can be isolated by comparing the vector of failed equations to each of the
fault detection vectors for a match. The ability to isolate multi-point failures is dependent
on the specific structure of the incidence matrix (i.e. the system of equations).

Step 1. Define Model: Example of 8 equations (2 based on hardware redundancy),
12 parameters

Model #   Parameter             Equation

1         HEX DS PR             PID 34  = fnc(40, 59, 210, 221, 234)
2         OPOV ACT POS CH A     PID 40  = fnc(59, 94, 210)
3         FPOV ACT POS CH A     PID 42  = fnc(40, 59, 94)
4         PBP DIS PR CH A       PID 59  = fnc(58, 94, 210)
5         PBP DIS TEMP CH B     PID 94  = fnc(40, 42, 59, 210, 234)
6         LPOP DIS PR CH B      PID 210 = fnc(34, 90)
7         LPOP DIS PR CH A      PID 209 = fnc(210)
8         PBP DIS TEMP CH A     PID 93  = fnc(94)

Figure 22. Fault Tree Logic for Isolating Sensor Failures
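
The isolation logic can be sketched with the eight example equations of Figure 22. Two points are assumptions of this sketch: the garbled entry in equation 1 is read as PID 221 (which gives the stated total of 12 parameters), and a PID is marked in every equation where it occurs, whether as the dependent or an independent variable. The function names are hypothetical.

```python
# Equation number -> (dependent PID, independent PIDs), as read from Figure 22.
EQUATIONS = {
    1: (34, (40, 59, 210, 221, 234)),
    2: (40, (59, 94, 210)),
    3: (42, (40, 59, 94)),
    4: (59, (58, 94, 210)),
    5: (94, (40, 42, 59, 210, 234)),
    6: (210, (34, 90)),
    7: (209, (210,)),
    8: (93, (94,)),
}

def fault_vectors(equations):
    """Map each PID to the set of equations it appears in (one incidence column)."""
    pids = set()
    for dependent, independents in equations.values():
        pids.add(dependent)
        pids.update(independents)
    return {
        pid: frozenset(
            number
            for number, (dependent, independents) in equations.items()
            if pid == dependent or pid in independents
        )
        for pid in sorted(pids)
    }

def isolate(failed_equations, equations):
    """Return the PIDs whose fault detection vector matches the failed equations."""
    failed = frozenset(failed_equations)
    return [pid for pid, vector in fault_vectors(equations).items() if vector == failed]

vectors = fault_vectors(EQUATIONS)
suspect = isolate({3, 5}, EQUATIONS)   # equations 3 and 5 failed together
```

Under these assumptions every PID has a distinct column, so any single-point failure that trips all of its equations matches exactly one PID; a failed-equation pattern that matches no column returns an empty list.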

2.2.4 Pattern Matching Techniques

Those sensors whose status cannot be effectively determined using linear or non- 
linear regression techniques, or engine characteristic equations, can be analyzed using 
pattern matching tools. Areas of potential application include start, shutdown, and 
power level transients, as well as thermal drifts and other highly non-linear phenomena. 
Pattern clustering appears to be a good candidate for the reconstruction of data which 
cannot be accurately reconstructed by using regression or engine characteristic 
modeling. Pattern matching techniques fall into two basic categories: pattern matching 
algorithms and artificial neural networks. Even though the mechanics of the two 
methods are different, the resultant output is similar for both. Neural networks are useful 
for both sensor validation and data reconstruction purposes and have been 
demonstrated with SSME data [17]. 

The two primary types of pattern matching algorithms are categorized as
decision-theoretic and semantic. For this study, only the decision-theoretic algorithms
were investigated. This family of algorithms operates roughly as follows: An exemplar
pattern of interest is input to the algorithm, along with a sample test pattern. The
algorithm reduces the two patterns into their respective vector components and
computes the matching score between the two. The matching score is a statistical
measure of the relative likeness between the vectors. This technique can be used to
validate sensors in the following manner: patterns of data which are known to be good,
such as the startup transient of the MCC pressure, can be input into the algorithm as
sample exemplar patterns. As more samples are used to train the algorithm, the
accuracy of the algorithm is increased. Once the algorithm has "learned" the pattern,
suspect data (data for which no validity determination has been made) can be input into
the algorithm. The matching score is then computed, and if it falls below a predetermined
threshold, the sensor (in this case PID 130) is classified as failed. The best known of the
decision-theoretic algorithms is the K-nearest neighbor classifier [18, 19, 20]. This algorithm
works on the principle that the probability of any particular point being part of the pattern
of interest is directly proportional to a specified number of points nearby (K), and
inversely proportional to the sample space volume containing those K points.
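
The K-nearest neighbor principle stated above corresponds to the density estimate K / (N * V), where N is the number of exemplar points and V is the volume of the smallest ball around the query point that contains K of them. The following is a minimal sketch under that reading; the Euclidean distance, the function name, and the sample values are assumptions.

```python
import math

def knn_density(point, samples, k):
    """Estimate density at `point` as k / (N * V) using the k-th nearest exemplar."""
    dimension = len(point)
    distances = sorted(math.dist(point, s) for s in samples)
    radius = distances[k - 1]
    if radius == 0.0:
        raise ValueError("k-th neighbor coincides with the query point")
    # Volume of a ball of the given radius in `dimension` dimensions.
    volume = math.pi ** (dimension / 2) / math.gamma(dimension / 2 + 1) * radius**dimension
    return k / (len(samples) * volume)

# Exemplar ("known good") pattern points, one feature each (placeholder values).
exemplars = [(0.0,), (1.0,), (2.0,), (10.0,)]
near = knn_density((1.0,), exemplars, k=3)   # inside the cluster of exemplars
far = knn_density((10.0,), exemplars, k=3)   # far from most exemplars
```

A point whose estimated density falls below a chosen threshold would be treated as not belonging to the learned pattern.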

Artificial neural networks appear to be good candidates for both sensor validation 
and data reconstruction. Neural networks are a highly parallel computational 
architecture which is roughly modeled on the physical structure of the brain. The basic 
building blocks of all artificial neural nets are layers of nodes (neurons) which have
weighted connections to other nodes. A number of specialized architectures, in which
the number and configuration of inputs, nodes, layers, connections, and outputs are
varied, have been developed for a variety of specialized purposes [21, 22, 23].
Connection weights can either be variable or fixed, depending on which particular
architecture is selected. Nodes in a particular layer can be connected to specific nodes
in the adjacent layers, or they can be connected to all of the nodes in adjacent layers
(fully connected). The first layer of nodes in any net is called the input layer, the last
layer is called the output layer, and all intermediate layers are called hidden layers. In the
case of a two layer network, one input is presented to each of the nodes in the first layer.
Each of the inputs is then output to a node in the following layer as the product of the
original input value and the appropriate connection weight. These products are then
combined algorithmically by the second node (known as a processing unit) and that
output is compared to a target value. The difference of the two is the residual error. As
in algorithmic pattern matching, if the residual error is within the required threshold, the
computation is complete and the final output, in this case a sensor signal, is considered
good. If not, the connection weights are modified algorithmically, and the neural net
process is repeated until an acceptable residual error level has been
reached.
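
The two layer process described above can be sketched for a single processing unit. The sketch assumes the unit simply sums the weighted inputs and that the weights are adjusted with the delta (Widrow-Hoff) rule; the text does not name the update rule, and this is not the back propagation algorithm discussed later. The function name, learning rate, and training data are hypothetical.

```python
def train_unit(inputs, targets, rate=0.1, tolerance=1e-3, max_passes=10000):
    """Adjust connection weights until every residual error is within tolerance."""
    weights = [0.0] * len(inputs[0])
    for _ in range(max_passes):
        worst_error = 0.0
        for x, target in zip(inputs, targets):
            # Processing unit: sum of input values times connection weights.
            output = sum(w * xi for w, xi in zip(weights, x))
            error = target - output
            worst_error = max(worst_error, abs(error))
            # Delta rule: move each weight in proportion to its input and the error.
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
        if worst_error <= tolerance:
            break
    return weights

# Targets generated from the weights (2.0, -1.0); the unit should recover them.
patterns = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
targets = [2.0, -1.0, 1.0]
learned = train_unit(patterns, targets)
```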

For the purposes of pattern recognition and data reconstruction, the best neural 
net architecture appears to be the multi-layer perceptron [22]. This architecture is better 
known as the back-error propagation or simply the back propagation network. This 
name refers to the algorithm which is used to reset the connection weights after each 
complete pass through the network. Figure 23 shows a flowchart representation of this 
architecture. The perceptron has variable connection weights, is fully connected 
throughout, and uses supervised learning. For the complex pattern matching and data 
reconstruction tasks on this program, at least four layers of nodes (two hidden layers) 
are desirable. The number of nodes in the hidden layers should be three times the 
number of input nodes, so that sufficient pattern definition is achieved [22]. This 
approach is identical to that taken by Guo and Nurre, who were successfully able to 
diagnose a simulated SSME sensor failure and reconstruct the lost data [17]. 

Another architecture which appears to have promise for pattern classification is 
known as competitive learning [21]. This architecture is very similar to the multi-layer 
perceptron, except that the former uses unsupervised learning, and only the weight of 
the highest valued or "winning" output node is modified during the recursive phase of the
process. This process continues until the value of the "winning" node no longer
changes.

Due to the complexity of both the neural network and pattern matching/pattern
clustering approaches to sensor validation and reconstruction, designing and 
implementing either system from scratch is relatively resource intensive. There are 
several commercial software products, representing both pattern matching paradigms, 
which are currently available and listed in Appendix D. These packages contain 
networks or algorithms which can perform pattern classification tasks. 

3.3 Knowledge Fusion Approaches 

As shown in the previous section, there are several possible sources of
information about a PID's failure (Table 9 summarizes the sources of information that
have been investigated so far). Given all of these pieces of information about a sensor,
which may be conflicting and have varying degrees of uncertainty associated with them,
the sensor validation system must be able to make (and justify) a decision about the
status of each sensor. This is the problem addressed by a sub-discipline of Artificial
Intelligence referred to as information fusion (also known as evidential reasoning or
reasoning with uncertainty).

Information fusion involves the combination of evidence from several sources into 
a single consistent model. Uncertainties in the sources of evidence (i.e., inaccuracies in 
the sensors or uncertainties in the fault detection algorithms themselves) are explicitly 
modeled and accounted for. There is a spectrum of information fusion techniques 
available, ranging from computationally efficient but unsound approaches, to those 
guaranteeing semantically correct results but having a high computational overhead and 
implementation complexity associated with them. In addition, for any given technique 
chosen there are typically many algorithms available for implementation. The following
section describes the four most popular techniques currently used for
information fusion and evaluates which is the most appropriate for use in the sensor
validation system.

Survey of Techniques

Four approaches to information fusion were evaluated for the sensor validation
system. These approaches were selected based on their frequency of use in fielded
systems and their mention in the literature. These measures of popularity indicate the
degree of confidence held in the techniques by the Artificial Intelligence community.
These techniques have also been in use long enough to ensure their maturity: the
MYCIN approach was developed in the 1970's; Dempster-Shafer theory was developed
in the 1970's and has seen wide use and mention in the literature in the 1980's; and the
Bayesian Belief Network approach was developed in the 1980's, although its foundation
can be traced to the roots of probability theory (1500's).

Binary Logic 

Binary Logic represents the most common approach to information fusion, and involves 
decision-making based on hard-coded rules, such as those in NEXPERT. Examples of 
such rules are: 

Voting redundant sensors.

Redlining (thresholding).

Fixed prioritization of sources of evidence. For example: "If two physically
redundant sensors differ by more than a threshold amount, then suspect the one
with lower variance."

Fault tree isolation logic.

Binary logic does not address uncertainties in the sensors or the sources of 
evidence. More importantly, it is highly susceptible to making wrong decisions (false 
alarms or undetected failures) since exhaustive enumeration of all possible exceptions to 
rules is extremely difficult, if not impossible.^1 The major advantages to binary logic are its
computational efficiency and ease of implementation (once the rules have been defined). 

MYCIN Certainty Factors 

Several attempts have been made to add the capability to reason with uncertainty 
to rule-based systems. One example of such an approach is MYCIN certainty factors.^2
MYCIN is a rule-based medical diagnostic system. In order to address uncertainties 
both in the observation of symptoms and in the diagnostic rules themselves, the 
developers of MYCIN devised an ad-hoc technique for representing and reasoning with 
uncertainty which could be layered onto their rule-based approach. This approach can 
be summarized as follows: 

Table 9. Sources of Information about Sensor Failures

A-Priori Knowledge about Likelihood of Failures
    Reliability of Sensor Class
    Sensor Failure History
    Sensor Time in Service
    Pre- and Post-Test Calibration

Physical Redundancy
    Alternate Sensors

Known Failure Mode Analysis
    "Universal" Failure Modes
        Hard Open Circuit
        Intermittent Open Circuit
        Short Circuit / Shutdown
        Spikes
    Speed and Flow Sensors
        Aliasing
    Pressure Sensors
        Thermal Drift
        Overshoot
        Loss of Reference Vacuum
    Temperature Sensors
        Thermal Expansion

Reasonableness Checks
    Red, Yellow, and "Reasonable" Lines
    Rate of Change
    Standard Deviation

Signal Analysis
    Moving Average
    Time Series

Analytical Redundancy
    Empirical Correlation Models
    Engine Characteristic Models

Each proposition (fact) in the knowledge base has a certainty factor associated
with it, which is a real number between -1 (indicating that the fact is definitely
false) and +1 (indicating that the fact is definitely true).

Each rule has an attenuation, a real number between 0 and 1, indicating the
uncertainty in the rule (a value of 1 indicates that if the rule's antecedents are
known with absolute certainty, then the rule's consequents can be concluded with
absolute certainty).

Given antecedents a1, a2, ..., an, and a consequent c for a rule:

Certainty[c] = Minimum(Certainty[a1], Certainty[a2], ..., Certainty[an]) * Attenuation

If two rules assert certainty factors for the same proposition, the resulting certainty
factor is found by:

x + y - xy                          if x, y > 0
(x + y)/(1 - Minimum(|x|, |y|))     if x, y have different signs
x + y + xy                          if x, y < 0

Where x and y are the certainty factors assigned by the two rules.
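
The certainty factor rules above can be written out directly. The sketch below follows the three-case combination formula, using the absolute-value form Minimum(|x|, |y|) for the mixed-sign case as in the standard statement of the MYCIN formula; the function names and the example values are hypothetical.

```python
def rule_certainty(antecedent_cfs, attenuation):
    """Certainty of a rule's consequent: weakest antecedent times the attenuation."""
    return min(antecedent_cfs) * attenuation

def combine_cf(x, y):
    """Combine two certainty factors asserted for the same proposition."""
    if x >= 0 and y >= 0:
        return x + y - x * y
    if x <= 0 and y <= 0:
        return x + y + x * y
    denominator = 1 - min(abs(x), abs(y))
    if denominator == 0:
        raise ValueError("cannot combine +1 and -1: total conflict")
    return (x + y) / denominator

# Two rules conclude "sensor failed" with different certainty.
cf_rule_1 = rule_certainty([0.9, 0.7], attenuation=0.8)
cf_rule_2 = 0.4
cf_failed = combine_cf(cf_rule_1, cf_rule_2)
```

Two supporting rules reinforce each other without exceeding 1, while a supporting and an opposing rule partially cancel.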


This approach is better than binary logic in that it attempts to deal with uncertainty
in an explicit way, and provides a means for combining multiple sources of evidence. As
with binary logic, this approach is also computationally efficient and straightforward to
implement (it can easily be added onto a NEXPERT rule base). However, since the
approach is based on ad-hoc formulas, there are cases in which it will produce
non-intuitive results.

One case in which it will give incorrect results is when the sources of evidence
contributing to a proposition are correlated. An example of this from the Chernobyl
disaster is shown in Figure 24 (the example is due to Henrion^3). Pearl says about this
example, "Multiple, independent sources of evidence would normally increase the
credibility of the hypothesis (Thousands dead), but the discovery that these sources
have a common origin should reduce the credibility. Extensional systems are too local
to recognize the common origin of the information, and they would update the credibility
of the hypothesis as if it were supported by three independent sources."^4

Dempster-Shafer Theory 

The Dempster-Shafer theory of evidential reasoning^5 has enjoyed wide
popularity in the Artificial Intelligence community in the last 10 years. In contrast with
approaches such as MYCIN certainty factors, it is a mathematically sound approach.

The Dempster-Shafer formalism maintains a body of evidence about a set of
mutually exclusive hypotheses (in sensor validation, the hypotheses would be of the
form "PIDi Failed"). In this theory, a source of information can assign probabilities to
disjunctions of hypotheses. For example, if evaluation of the engine characteristic

LPOPH/N2 = (P209-P860)/P30^2

at a particular time point indicates that the characteristic is anomalous (e.g., the residual
is greater than 3 sigma), the following probability assignment could be made:

Confirm({p209, p860, p30}, 0.9)

This assignment indicates that either P209 or P860 or P30 has failed with a 0.9
probability. A formal method, Dempster's Rule of Combination, exists to combine two
statistically independent bodies of evidence formed by statements of the form shown
above. Once all sources of evidence have been combined, the Belief and Plausibility of
any disjunction of hypotheses can be found.

Belief in a hypothesis is the probability that a logical proof for the hypothesis
exists (i.e. if evidence assignments are interpreted as constraints, this is the
probability that the constraints allow the hypothesis to be deduced). This can
also be interpreted as the degree to which the evidence supports the proposition.

Plausibility of a hypothesis is the probability that it is compatible with the evidence
(i.e. the probability that it cannot be disproved and is therefore possible). Thus,
this is the degree to which the evidence fails to refute the proposition:
Plausibility(H) = 1 - Belief(not H)

Thus, once all sources of information about a sensor have been combined, the Belief in 
each sensor’s failure hypothesis could be examined and acted upon if over some 
threshold. 
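
Dempster's Rule of Combination, Belief, and Plausibility can be sketched with sets of hypotheses as dictionary keys. The first body of evidence is the Confirm({p209, p860, p30}, 0.9) assignment from the text, with the remaining 0.1 assigned to the whole frame. The second body of evidence (an anomalous HPOP R2 test at 0.8), the "NONE" hypothesis, and the function names are assumptions made for the example, and the two tests are treated as independent even though they share a PID.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule: multiply masses, intersect sets, renormalize the conflict."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("bodies of evidence are totally conflicting")
    return {s: mass / (1.0 - conflict) for s, mass in combined.items()}

def belief(masses, hypothesis):
    """Total mass of the sets contained in the hypothesis."""
    return sum(mass for s, mass in masses.items() if s <= hypothesis)

def plausibility(masses, hypothesis):
    """Total mass of the sets that intersect the hypothesis."""
    return sum(mass for s, mass in masses.items() if s & hypothesis)

FRAME = frozenset({"P209", "P860", "P30", "P90", "P1212", "NONE"})
lpop_h_n2 = {frozenset({"P209", "P860", "P30"}): 0.9, FRAME: 0.1}
hpop_r2 = {frozenset({"P209", "P90", "P1212"}): 0.8, FRAME: 0.2}   # hypothetical

fused = combine(lpop_h_n2, hpop_r2)
p209 = frozenset({"P209"})
bel_p209 = belief(fused, p209)
pl_p209 = plausibility(fused, p209)
```

Because P209 is the only PID common to both anomalous tests, most of the combined mass lands on it alone, while a PID appearing in only one test keeps zero Belief and a small Plausibility.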

Figure 24. Chernobyl Disaster Example Shows Why Rules Cannot Combine
Locally

Dempster-Shafer Theory has several disadvantages when applied to the sensor
validation information fusion problem:


The theory assumes that all sources of evidence are statistically independent (this 
is not true if any PIDs are used in more than one test). 

The theory assumes that exactly one of its hypotheses is true; thus, it will not
detect multiple-point failures. (However, if the method is used for every time slice
of data this should not be a problem, since the probability of having more than
one failure at a given instant is extremely small.)

Each application of Dempster’s rule can be computationally very expensive. 
Direct implementation is exponential in the number of hypotheses, although 
approximate solutions have been developed. 

Finally, Dempster-Shafer theory is good for reasoning about problems in which 
constraints are stated explicitly, such as in design or planning, since the objects 
that are reasoned over are constraints among its hypotheses. 

Bayesian Belief Networks

Bayesian probability theory, like the Dempster-Shafer theory described above, is 
mathematically sound. However, prior to the development of graphical representations 
and efficient network solution algorithms, its application to non-trivial problems (with 
more than a few dozen random variables) was extremely awkward if not intractable. 
Graphical representations of joint probability distributions provide a very intuitive 
knowledge representation format, and the Bayesian network formalism allows the 
requisite probabilities to be specified in a very concise and painless manner. 

A Bayesian network consists of nodes which represent discrete-valued random
variables. Examples of such nodes in the sensor validation system are shown in Figure
25. The node (random variable) P209 represents the current state of PID 209, and can
be in one of five mutually exclusive states: OK, Hard_Open, Intermittent_Open, Drift, or
Bias. Associated with this node is a probability distribution which describes the
probability of the node being in each of its possible states given all available information.
The LPOPH/N2 node represents the outcome of the engine characteristic test:

LPOPH/N2 = (P209-P860)/P30^2

This is computed from a time slice of data and compared to a sampled baseline
characteristic. The states for this node represent the residual from the comparison, and
thus the outcome of the test.

Directed arcs (lines) between nodes in a Bayesian Belief Network represent
influences. In particular, an arc from node A to node B indicates that knowledge of node 
A’s state can change the probability distribution for node B. Figure 26 shows the three 
arcs influencing node LPOPH/N2, namely those coming from the three PIDs involved in 
the test (a change in the status of any of the PIDs involved can change the outcome of 
the test). The nodes and arcs in a Belief Network must form a Directed Acyclic Graph 
(DAG), that is, the nodes can be connected in any manner as long as you cannot start at 
a node and get back to the same node by following directed arcs through the network. 
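
The acyclicity requirement can be checked mechanically. The sketch below uses Kahn's topological-sort algorithm on the three-arc example; the function name and the list-based graph representation are illustrative assumptions.

```python
from collections import deque


def is_dag(nodes, arcs):
    """Return True if the directed arcs (parent, child) contain no cycle."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in arcs:
        children[parent].append(child)
        indegree[child] += 1
    # Repeatedly remove nodes that have no remaining incoming arcs.
    ready = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while ready:
        node = ready.popleft()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    # Nodes on a cycle are never removed, so they are never counted.
    return visited == len(nodes)


nodes = ["P209", "P860", "P30", "LPOPH/N2"]
arcs = [("P209", "LPOPH/N2"), ("P860", "LPOPH/N2"), ("P30", "LPOPH/N2")]
```

Adding an arc from LPOPH/N2 back to P209 would create a cycle, and the check would reject the network.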

Once the topology of a network has been defined, two types of probabilities must 
be specified to complete the network. First, all nodes which do not have any influencing 
arcs (i.e., no arcs coming into them) must have default probability distributions for their 
states defined (P209 in Figure 27 shows an example of this). In the Belief Networks used 
for sensor validation, these nodes typically represent random variables specifying the 
status of each PID. The default probability distributions would be obtained from 
historical reliability data for each sensor (e.g., PID xyz has exhibited a 0.99 reliability over 
the last 30 tests with a 0.005 probability of failing hard open circuit and a 0.005 
probability of failing by drift), coupled with time in service, and pre- and post-test 
calibration. 

Second, every node which has influencing arcs must have a probability distribution 
specified that is conditioned on the states of its influencing nodes (see LPOPH/N2 in 
Figure 27). In the Belief Networks used for sensor validation, these nodes typically 
represent random variables specifying the outcomes of diagnostic tests. The probability 
distributions can be obtained analytically by analysis of each test used. 
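
As an illustration of the first kind of probability, the sketch below builds a default distribution for a PID node from the reliability example quoted above (0.99 OK, 0.005 hard open, 0.005 drift). The helper name and the rule of assigning the remainder to OK are assumptions.

```python
import math

STATES = ["OK", "Hard_Open", "Intermittent_Open", "Drift", "Bias"]


def make_prior(states, fault_probs):
    """Build a root-node prior: listed fault states get their probability,
    unlisted fault states get zero, and OK gets the remainder."""
    unknown = set(fault_probs) - set(states)
    if unknown:
        raise ValueError(f"unknown states: {sorted(unknown)}")
    total_fault = sum(fault_probs.values())
    if not 0.0 <= total_fault <= 1.0:
        raise ValueError("fault probabilities must sum to a value in [0, 1]")
    prior = {s: fault_probs.get(s, 0.0) for s in states if s != "OK"}
    prior["OK"] = 1.0 - total_fault
    return prior


prior = make_prior(STATES, {"Hard_Open": 0.005, "Drift": 0.005})
assert math.isclose(sum(prior.values()), 1.0)
```

A state with zero prior probability can never gain probability from evidence, so a small nonzero value may be preferable for failure modes that have simply not been observed yet.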

A fully specified Belief Network can be used to answer queries in the following 
manner: 

1. "Observables" are instantiated (in the example above, this consists of setting the state 
of the LPOPH/N2 node to reflect the test outcome). 

2. A network update algorithm is run [6,7,8]. 

3. The probability distributions of nodes of interest are examined (in the example 
above, this consists of examining the distributions for P209, P860, and P30 to see if 
the probability of any fault state exceeds a threshold). 

Figure 25. Example Bayesian Belief Network Nodes 

Figure 26. Example Belief Network Influences 
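
The three query steps can be sketched for the single-test example using brute-force enumeration in place of a real network update algorithm. This is only practical for very small networks. The two-state PID model, the priors, the conditional probabilities, and the 0.25 threshold are all hypothetical.

```python
from itertools import product


def posterior(priors, likelihood, observed):
    """Posterior marginals of the PID nodes given one observed test outcome.

    priors:     {pid: {state: probability}} for the root nodes
    likelihood: function(outcome, {pid: state}) -> p(outcome | PID states)
    """
    pids = list(priors)
    marginals = {pid: {s: 0.0 for s in priors[pid]} for pid in pids}
    evidence = 0.0
    for combo in product(*(priors[pid] for pid in pids)):
        assignment = dict(zip(pids, combo))
        weight = likelihood(observed, assignment)
        for pid in pids:
            weight *= priors[pid][assignment[pid]]
        evidence += weight
        for pid in pids:
            marginals[pid][assignment[pid]] += weight
    return {pid: {s: w / evidence for s, w in dist.items()}
            for pid, dist in marginals.items()}


def residual_likelihood(outcome, assignment):
    # Hypothetical table: 0.003 false-alarm rate, 0.9 if any input has failed.
    p_exceeds = 0.003 if all(s == "OK" for s in assignment.values()) else 0.9
    return p_exceeds if outcome == "exceeds" else 1.0 - p_exceeds


priors = {pid: {"OK": 0.99, "Failed": 0.01} for pid in ("P209", "P860", "P30")}
post = posterior(priors, residual_likelihood, "exceeds")    # steps 1 and 2
suspects = [pid for pid, dist in post.items() if dist["Failed"] > 0.25]  # step 3
```

With these numbers a single failed test raises the failure probability of each of the three PIDs equally, so one test alone cannot isolate the faulty sensor. Evidence from further tests sharing only some of the PIDs is needed for that.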

The major advantage of Bayesian Belief Networks is that they are the most 
semantically correct way to perform information fusion. For analysis problems such as 
diagnosis, Bayesian probability theory is better suited than the Dempster-Shafer 
approach, since the objects that it reasons over are probabilistic models (there are no 
explicit constraints known a priori) [9]. However, the Belief Network solution algorithms 
can be complex to implement and computationally expensive. 

Evaluation of Techniques 

Table 10 shows the results of a trade study on the four techniques discussed 
above for use in sensor validation information fusion. In the trade, the "soundness" 
criterion was given the highest weighting because the whole purpose of information 
fusion is to give the best possible evaluation of all sources of information. The "Ease of 
Implementation" criterion includes not only implementation of the fusion algorithm, but 
the encoding of all requisite knowledge to perform the sensor validation information 
fusion task (i.e., specification of probabilities, certainty factors, logic rules, etc.). Based 
on this trade, Bayesian Belief Networks are the recommended approach to information 
fusion for sensor validation. 

Application to Sensor Data Validation 

Figure 29 shows how all of the information available about the state of P209 
(LPOP discharge pressure) might be integrated using Belief Networks. Given the 
research performed in Phase I, it is known that P209 can be evaluated by two 
characteristic tests (LPOPH/N2, HPOPR2), by an empirical test (relating P209 to P211 
and P91), by range tests (e.g., 2-sigma bands), by pattern-matching techniques which 
look for specific failure modes such as spikes and drifts, and by comparison to P210 
(channel B). In addition, information about P209’s failure history, time in service, pre- 
and post-test calibration, and the reliability of the transducers used to measure LPOP 
discharge pressure can be combined into an initial probability distribution for P209 and 
combined with the evidence gathered for each time slice of the test data analyzed. 

The specification of the Belief Networks needed should be very straightforward. A 
preliminary analysis of the networks required indicates that most of the probabilities, and 
possibly the network topology itself, can be automatically compiled from descriptions of 
the various sources of information (e.g., engine characteristic and empirical PID 
relations). 

p(LPOPH/N2 State | P209 State, P860 State, P30 State) 

p(Residual<3sigma | P208=OK, P90=OK,     P1212=OK)       0.997 
p(Residual>3sigma | P208=OK, P90=OK,     P1212=OK)       0.003 
p(Residual<3sigma | P208=OK, P90=OK,     P1212=Failed)   0.500 
p(Residual>3sigma | P208=OK, P90=OK,     P1212=Failed)   0.500 
p(Residual<3sigma | P208=OK, P90=Failed, P1212=OK)       0.100 
p(Residual>3sigma | P208=OK, P90=Failed, P1212=OK)       0.900 
p(Residual<3sigma | P208=OK, P90=Failed, P1212=Failed)   0.100 
p(Residual>3sigma | P208=OK, P90=Failed, P1212=Failed)   0.900 

Figure 27. Example Belief Network Probability Specification 

Figure 28. Information Fusion Techniques Trade Offs 

Figure 29. Example Belief Network Segment Around P209 

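
One way such compilation might work is sketched below: a conditional probability table is generated from a short description of a test, namely a false-alarm rate and one detection probability per input PID. The rule that the most detectable failed input dominates is an assumption; it happens to reproduce the rows listed in Figure 27 for P90 and P1212. The 0.9 value for P208 is hypothetical, since the figure only lists rows with P208 = OK.

```python
from itertools import product


def compile_cpt(detect_prob, p_false_alarm=0.003):
    """Enumerate p(test outcome | PID states) for a two-outcome residual test.

    detect_prob: {pid: probability that the residual exceeds 3 sigma when
                  that PID alone has failed}
    Returns {(state of each pid, in order): {outcome: probability}}.
    """
    pids = list(detect_prob)
    cpt = {}
    for combo in product(("OK", "Failed"), repeat=len(pids)):
        failed = [detect_prob[p] for p, s in zip(pids, combo) if s == "Failed"]
        # Assumed combination rule: the most detectable failure dominates.
        p_alarm = max(failed) if failed else p_false_alarm
        cpt[combo] = {"Residual<3sigma": 1.0 - p_alarm,
                      "Residual>3sigma": p_alarm}
    return cpt


cpt = compile_cpt({"P208": 0.9, "P90": 0.9, "P1212": 0.5})
```

The table grows as 2^n with the number of input PIDs, which is manageable for tests involving only a few sensors.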

4.0 System Software Specification 

4.1 Scope 

4.1.1 System Objective 

The major objective of the Sensor Validation system is to automatically detect 
sensor malfunctions in the SSME test data, and to reconstruct the data for any 
malfunctioning sensors using alternate sources of information. The system will run in 
two major stages: an initial batch processing mode, followed by an interactive 
post-processing mode (see Figure 30). In the batch mode, the sensor data from the SSME 
test (in engineering units) is thoroughly analyzed by the sensor validation system, with 
PID failures flagged and PID value reconstructions automatically run. The purpose of the 
interactive mode is to allow analysts to quickly understand the results of the batch mode 
processing, and either confirm or override the failure and reconstruction decisions made 
by the sensor validation system. 

Batch Processing Mode 

In the batch processing mode the SSME data is analyzed and acted on according 
to three user-specified thresholds: 

Report threshold - the system will write a report whenever the estimated 
probability of any sensor failing crosses this threshold (in either direction). The 
report will be produced in two parts: a brief summary stating which PID(s) changed 
state and when, and a detailed report describing how the system arrived at the 
estimated probability. 

Reconstruction threshold - the system will reconstruct the value for a sensor using 
alternate sources of information whenever its probability of failure exceeds this 
threshold. If several viable methods for reconstruction exist, the system will pick the 
method with the highest probability of being correct (based on the failure 
probabilities of any other sensors involved in the reconstruction and the accuracy of 
the method). 

Failure threshold - when a sensor’s probability of failure exceeds this threshold, the 
system will assume that its value cannot be used to cross-check other sensors or in 
reconstruction of other sensor values. 

Although users are free to set the thresholds at any values, it is expected that 
typically the following relative settings will be used: 

Report < Reconstruction < Failure 

Using these settings, the system would typically generate reports whenever a 
sensor exhibits any questionable behavior. Additionally, reconstructed values will be 
produced in cases when the system does not conclude that the sensor has failed, but 
simply derives an unusually high probability of failure. Thus, the user still has 
pre-computed reconstructed values to use in case he or she overrides the system’s 
judgement about the failure status of a sensor. 
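
The behavior of the three thresholds at a single time point can be sketched as follows. The class and function names, the action labels, and the threshold values are hypothetical, and "exceeds" is read as a strict inequality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    report: float
    reconstruction: float
    failure: float


def actions(p_prev, p_now, t):
    """Actions triggered for one sensor when its failure probability
    moves from p_prev to p_now between consecutive time points."""
    acts = []
    # A report is written when the report threshold is crossed in either direction.
    if (p_prev > t.report) != (p_now > t.report):
        acts.append("report")
    if p_now > t.reconstruction:
        acts.append("reconstruct")
    if p_now > t.failure:
        acts.append("exclude_from_cross_checks")
    return acts


t = Thresholds(report=0.1, reconstruction=0.3, failure=0.7)
borderline = actions(0.05, 0.4, t)   # reported and reconstructed, not yet failed
```

With the typical ordering, a borderline sensor is reported and reconstructed before the system stops using it to cross-check other sensors.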

Batch mode processing is expected to take place immediately following an SSME 
test, with results available within an hour for use by the rest of the Life Prediction System 
and for use in Interactive mode analysis. Thus, this processing will not exceed one hour 
in duration on a Sun SPARCStation. 

In batch mode, all sources of available information will be analyzed and fused to 
reach the best possible decision about the status of a sensor. The tests incorporated 
into the initial version of the sensor validation system will include: 

Empirical models 

Characteristic models 

Red-line test 

Sensor class reliability 

Sensor failure history 

Some form of pattern-matching 

Comparison with redundant channels 


The system will be constructed so that tests can be added or 
modified with minimum effort. Bayesian Belief Networks will be used as the approach to 
information fusion. An approach to specifying the networks will be developed which 
minimizes the effort required to define the network(s) required and the associated 
probabilities. 

In addition to flagging sensor failures and performing reconstructions, the batch 
mode software will determine the best source of information to use for each physical 
measurement (e.g., channel A, channel B, reconstruction method 1, reconstruction 
method 2, family-averaged historical value, etc.). This designation can then be used by 
other modules in the SSME Life Prediction system (e.g., expert diagnostic modules) so 
that they only need to look at a single source of data for each measurement and not 
concern themselves with evaluating the different possible sources of information. Thus, 
this provides a form of data reduction for the entire Life Prediction system. 
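
A possible selection rule is sketched below. The score, which multiplies a method's accuracy by the probability that all of its input sensors are healthy, assumes independent sensor failures and is not taken from the study. The accuracy figures and failure probabilities are hypothetical.

```python
def best_source(candidates, p_fail):
    """Pick the source with the highest estimated probability of being correct.

    candidates: {name: (accuracy, [input PIDs])}
    p_fail:     {pid: current failure probability}
    """
    def score(item):
        accuracy, inputs = item[1]
        s = accuracy
        for pid in inputs:
            s *= 1.0 - p_fail[pid]   # every input sensor must be healthy
        return s

    name, _ = max(candidates.items(), key=score)
    return name


# Hypothetical sources for LPOP discharge pressure.
candidates = {
    "channel A": (1.0, ["P209"]),
    "channel B": (1.0, ["P210"]),
    "empirical model": (0.95, ["P211", "P91"]),
}
p_fail = {"P209": 0.8, "P210": 0.02, "P211": 0.01, "P91": 0.01}
chosen = best_source(candidates, p_fail)
```

Downstream modules would then read only the chosen source for each measurement at each time point.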

Interactive Post-Processing Mode 

The Interactive post-processing mode is intended to provide an analyst with an 
environment in which he or she can quickly understand the conclusions reached by the 
system during its batch processing, and either confirm or override the decisions made 
by the system. The analyst will also be able to display arbitrary PID value vs. time plots, 
and run any available reconstruction algorithm. The interactive software will make 
maximum use of a mouse-driven, color, graphical user interface to convey the sensor 
validation system’s results as efficiently as possible, and to minimize the analysts’ 
learning time. 

The post-processing software will have three main displays (in addition to a "main 
menu" for specifying test numbers, top-level operations, etc.). The first display will show 
a color plant diagram of the SSME with icons representing all CADS PIDs (see Figure 
31). Sensors which had been flagged as failed during batch processing (according to 
the failure threshold) will be highlighted on the display. This display gives the analyst a 
quick, global view of problems detected by the sensor validation system. In addition, 
the highlighting can reflect a single instant in time during the test, and the analyst will be 
able to move a scrollbar along the bottom of the display to advance forward or backward 
in time to get a quick feel for the chronology of events during the test. If a PID icon is 
clicked on with the mouse, a pop-up window will appear showing a brief summary of the 
PID’s status. Further, if this window is clicked on, a justification display will appear to give 
a complete description of the sensor validation system’s evaluation of that PID at the 
indicated time. 

The justification display is the second main display in the interactive system (see 
Figure 32). When requested by the analyst, a display will appear showing a verbal 
description of the evaluation of a PID at a specific time, and any supporting graphics 
(e.g., plots) will also be displayed. 
Figure 31. Interactive Plant Display 

Figure 32. Interactive Justification Display 

Figure 33. Interactive PID Matrix Display 


The third main display in the interactive system is the "PID Matrix" (see Figure 33). 
This display shows a concise summary of the system’s conclusions and 
recommendations, and allows the analyst to override any of the entries. Each PID is 
displayed on a timeline, with the color of the display indicating the status of the PID (e.g., 
green for OK, red for failed). Although the initial display represents the validation 
system’s batch mode conclusions, the user can click the mouse on any segment of the 
display and modify the status of the sensor. Additionally, if the sensor is declared failed, 
the user can specify the reconstruction method used and whether a UCR should be 
generated or not. 

Once the analyst OKs the PID Matrix, the sensor validation system automatically 
assembles a test data file with all selected reconstructions, generates all requested 
UCRs, and generates a final report. The report includes the justifications for all failed 
sensors (including text and graphics), and is editable by the analyst using SunWrite. 

Global Objectives 

In addition to the objectives mentioned above, the following objectives apply: 

Although the Sensor Validation system will eventually have to interact with the 
Session Manager to obtain its data and to interface with the user, it will initially be 
designed as a stand-alone system since it is the first module in the SSME Life 
Prediction system planned to be completed. However, a clear migration path 
from standalone to embedded processing will be maintained. 

The sensor validation system will be designed so as to minimize the effort 
required to modify engine data (e.g., PID lists, sensor specifications, sampling 
rates, etc.). 

The sensor validation system will be kept as engine-generic as possible so as to 
minimize effort in porting the system to a different engine (e.g., the SBE). 

4.1.2 Hardware 

The sensor validation system will be implemented on a Sun SPARCStation. 

4.1.3 Software 

The sensor validation system will be implemented using the following software 
languages, tools, and environment: 

Operating System: Unix 

Procedural Language: C 

Expert System Shell: NEXPERT Object 

Windowing/Graphics System: Motif or Dataviews 

CAE Package: PV-WAVE 

4.1.4 Human Interface 

The human interface to the system will provide a graphical, mixed-initiative 
interface to the set of tools that the sensor validation system provides. Point-and-click 
functionality will be used throughout (including the use of pop-up menus) to initiate all 
functions so that the analyst does not need to remember commands and their 
parameters. Different activities (e.g., plant display, justification, default modification, etc.) 
will take place in separate windows so that the user can visually cross-reference 
information when desired. A mixed-initiative interface will be used so that at any time 
either the system can lead the user (e.g., with a suggested action or a query) or the user 
can direct the system (e.g., with a new command or volunteered information). 

The display format of data (e.g., test data plots) will adhere as closely as possible 
to the formats used in current hardcopies to minimize the users’ effort in orienting to the 
system. 

The user will be able to index PIDs either by PID number, by Rocketdyne number, 
by label (e.g., "MCC COOLANT DISCH PRESS CH A1"), or by clicking on the 
appropriate plant display icon. 

Stylistically, the system will adhere to the OPEN LOOK Graphical User Interface 
specification through the use of Sun’s OpenWindows window system. 

4.1.5. Major Software Functions 

Batch Mode 

Import and partition Engine Test Data. The system will import test data from an 
Ingres database and partition it into steady-state and transient intervals. 

Import PID Reliability and Failure History. The system will import the failure history 
for all PIDs and the reliability figures for all PIDs from an Ingres database, and 
integrate this information into its decision about the status of PIDs. 

Import Family-Averaged Models. The system will import all summarized 
characteristic and empirical (regression equation) information about previous 
tests from an Ingres database, and integrate this information into its decision 
about the status of PIDs. This information includes: 

True Engine Characteristics. 

Empirical model constants. 

PID value means and standard deviations as a function of power level (for 
computing yellow and red-lines, and as a "last resort" method for reconstruction). 


Assess sensor status. The system will determine each PID’s probability of 
failure at each time point in the test data. 


Report sensor failures. Whenever a sensor’s probability of failure exceeds a user- 
specified Report Threshold, a brief statement will be output to a Batch Report file, 
and a detailed justification of the probability assessment will be output to a 
Justification Data file. The Justification Data file can contain not only textual 
descriptions but also specifications for generating supporting plots. 

Reconstruct sensor values. Whenever a sensor’s probability of failure exceeds a 
user-specified Reconstruction Threshold, the system will reconstruct the sensor’s 
value from that time point until the end of the test using alternate sources of 
information. There are different thresholds for reconstruction and failure to 
support efficient interactive processing, so that an analyst can use reconstructed 
data for a "borderline" sensor, even though the system did not declare it as failed. 
If several viable reconstruction methods exist, the system will pick the one with the 
highest probability of being correct (based on the failure probabilities of any other 
sensors involved in the reconstruction and the accuracy of the method). Once 
reconstruction has started, the method used may be changed dynamically as the 
failure probabilities of other sensors (i.e., those used in the reconstruction method) 
change. All reconstructed data will be output to an Ingres database. 

Conclude sensor failure. Whenever a sensor’s probability of failure exceeds a 
user-specified Failure Threshold, the system will not use the sensor’s value in any 
further reconstructions or cross-checks for other sensors. In addition, a brief 
statement describing the failure will be output to the Batch Report for use in the 
Interactive Mode. 

Generate Engine-Specific Models. Characteristics (as described in the Import 
Family-Averaged Models function above), empirical model constants, and mean and 
2-sigma values, computed and averaged per power level for the engine under test, 
will be written to an Engine Characteristics Ingres database. 

Select best source of information. For each physical measurement (e.g., PC, 
LPFP DISCH PRESS, etc.), the system will determine the best source of 
information at every time point based on its sensor analyses. The sources may 
include not only redundant channels, but any available reconstruction methods. 
This information will be output to the Batch Report file. 

Generate validated data set. When this option is selected at the start of the batch 
mode processing, the system will assemble a final validated data set, integrating 
real and reconstructed values according to the user-specified Failure threshold 
(i.e., whenever the probability of a sensor’s failure exceeds this threshold, its value is 
replaced with the output of the best reconstruction method available). 
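
The Generate validated data set function amounts to a per-time-point substitution, sketched below. The function name, the tuple layout, and the sample values are hypothetical, and the reconstructed series is assumed to come from whichever method was selected as best.

```python
def assemble_validated(measured, reconstructed, p_fail, failure_threshold):
    """Merge measured and reconstructed series for one PID.

    Each output entry is (value, source); the reconstructed value is used
    wherever the failure probability exceeds the Failure threshold.
    """
    validated = []
    for raw, recon, p in zip(measured, reconstructed, p_fail):
        if p > failure_threshold:
            validated.append((recon, "reconstructed"))
        else:
            validated.append((raw, "measured"))
    return validated


# Hypothetical three-sample series for one PID.
result = assemble_validated(
    measured=[300.0, 302.0, 0.0],
    reconstructed=[300.5, 301.5, 301.0],
    p_fail=[0.01, 0.20, 0.95],
    failure_threshold=0.7,
)
```

Recording the source alongside each value keeps the substitution traceable in later reports.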

Interactive Mode 

Display PID Matrix. The system will display a graphical matrix indicating the 
assessed status of each PID at each time point in the test (e.g., a green bar will 
indicate that the PID is functioning normally, while a red bar will indicate a failure). 
When an entry is clicked on with the mouse, a popup window will appear showing 
a brief summary of the PID’s status and, if a failure is indicated, the reconstruction 
method used and whether a UCR will be issued or not (see Figure 33). The 
matrix will be initialized from data in the Batch Report, but the user can modify any 
of the entries via the popup window. If the user changes a PID’s status, all 
ramifications of this must be determined by the system (in particular, if the user 
declares a PID as failed, then any reconstructions based on that PID must be 
invalidated). In addition, the popup window will have a "Justify" button which will 
cause a justification for the system’s assessment to be displayed if it is clicked on 
(see Display Justification below). 

Display plant summary. The system will display a picture of the SSME plant 
diagram with all PIDs depicted by icons which indicate assessed status (e.g., a 
red icon will indicate a sensor failure, while green will indicate that the sensor is 
OK). The plant display will have two modes: summary and chronological. In 
summary mode the maximum failure probabilities over the duration of the test will 
be used for the icon display (i.e., the display indicates the worst-case status of all 
PIDs over the duration of the test). In chronological mode, the user will be able to 
step forward or backward through the test time by moving a scrollbar along the 
bottom of the plant display (see Figure 31) with the icons updated so as to 
display their status at that point in the test. If an icon is clicked on, a summary of 
its status over the duration of the test will appear in a pop-up window. If one of the 
entries in this summary is clicked on, a justification for the system’s assessment is 
displayed (see Display Justification below). The plant summary is generated from 
information in the PID Matrix. 

Display Justification. The system will display the justification data for a given PID 
assessment when requested by the user. The justification information will be 
imported from the Justification Data file generated during batch mode. Text and 
supporting graphics (i.e., plots) will be displayed in separate windows. 

Plot Generation. The system will plot any PID value, reconstructed PID value, or 
any combination of these over any requested time interval. If a reconstruction is 
requested which was not run during batch mode, the reconstruction is run 
immediately using information from the engine test data, and the family and 
engine characteristics databases. 

Authorization. Once the user is satisfied that the PID Matrix is correct, he or she 
can authorize the system to complete its processing. This includes the following 
functions: 

- The system generates all UCRs specified in the PID Matrix. 

- The system updates the Failure History database according to the sensor failures 
indicated in the PID Matrix. 

- The system updates the Family Characteristics database using the information in 
the Engine Characteristics database output in batch mode, and using information 
in the PID Matrix to determine what should not be updated (due to sensor 
failures). 

- A final report will be generated. This includes a brief textual summary describing 
all PID failures, followed by all justification information for each PID failure. The 
report will be output in SunWrite format so that it can be edited by the user. 
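
The override rule stated under Display PID Matrix (a PID declared failed invalidates every reconstruction based on it) can be sketched as a simple dependency lookup. The function name and the dependency table are hypothetical; the P209 entry follows the empirical relation between P209, P211, and P91 mentioned earlier.

```python
def invalidated_reconstructions(reconstruction_inputs, failed_pids):
    """Return the reconstructed PIDs that depend on any PID declared failed.

    reconstruction_inputs: {reconstructed PID: iterable of input PIDs}
    """
    failed = set(failed_pids)
    return sorted(
        target
        for target, inputs in reconstruction_inputs.items()
        if set(inputs) & failed
    )


# Hypothetical reconstructions currently selected in the PID Matrix.
reconstruction_inputs = {"P209": ["P211", "P91"], "P860": ["P30"]}
to_redo = invalidated_reconstructions(reconstruction_inputs, ["P91"])
```

Each invalidated reconstruction would then need to be rerun with another method or marked as unavailable before authorization.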

4.2. Reference Documents 

IR&D Proposal AMP 91-03: Integrated Controls & Health Monitoring, Aerojet 

NEXPERT Object User’s Manuals (vol I and II), Neuron Data, Inc. 

OpenWindows 1.0 User’s Guide, Sun Microsystems 

XView 1.0 Reference Manual: Summary of the XView API, Sun Microsystems 

SunWrite 1.1 User’s Guide, Sun Microsystems 

SunOS Reference Manual (vol I, II, and III), Sun Microsystems 

PV-WAVE User’s Manual 

Programming Language C, X3.159-1989, ANSI 

4.3. Preliminary Design Description 

4.3.1. Batch Mode 

Data Flow 

Figure 34 shows the top-level, formal data flow diagram for the batch mode 
processing modules in the sensor validation system. Figure 35 shows the next level data 
flow diagram for the Sensor Failure Detection module. The software modules in these 
diagrams are described next. 

Software Modules 

Steady-State/Transient Partitioning. The test data will be partitioned into steady- 
state and transient intervals. 

Figure 35. Sensor Failure Detection Module Data Flow Diagram 

Model Sampling. Data samples required by the characteristic and empirical 
models (and any other models requiring engine-specific tuning for other tests) will 
be taken from each data set partition. 

Sensor Failure Detection. The probability of failure for each PID at each time point 
will be assessed based on the fusion of all available information and tests and 
output to the PID Status table. For each crossing of the Report Threshold, a 
justification will also be output to the Justification Data file. 

Characteristic Evaluator. The characteristics for the engine under test (including 
true engine characteristics, and PID means and standard deviations) are 
computed and output to the Engine Characteristics database. 

Reconstruction. For each crossing of the Reconstruction Threshold a 
reconstructed sensor value will be generated and output to the Batch Mode 
Reconstructed Data database. 

Report Generation. The contents of the PID Status table will be output to the 
Batch Report file. 

In-Family Test. Each engine-specific characteristic sampled by the Model 
Sampling module will be compared with family values in the Family Characteristics 
database. If the engine-specific characteristic is "out-of-family", then that 
characteristic will not be used for further sensor assessment, and information 
about the out-of-family condition will be passed to the Information Fusion 
module. 

Time-Slice Partitioning. All PID values for a given test sampling time will be 
extracted for use by the various assessment tests. 

Characteristic Tests, Empirical Tests. All viable characteristics and empirical 
models will be evaluated and compared to their corresponding sampled baseline 
values. Information about the degree of disagreement (i.e., the size of the 
residual) will be passed to the Information Fusion module. Whenever the 
probability of a sensor’s failure exceeds the Failure Threshold (based on the PID 
status table), tests involving that sensor’s values will no longer be used. 

Pattern Matching. Any pattern-matching techniques will be run, with the results 
passed to the Information Fusion module. 

Information Fusion. All sources of information about the status of each PID at 
each time point in the test will be fused together using the technique of Bayesian 
Belief Networks. The resulting probability of failure for each PID will be compared 
to the Report Threshold and, if exceeded, justification data based on the tests 
involved in the assessment and the Bayesian analysis will be written to the 
Justification Data file. The best source of information to use for each physical 
measurement, and the best reconstruction method to use for each PID are also 
determined. 
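
The Steady-State/Transient Partitioning module listed above is not specified in detail. One simple possibility is sketched below, assuming a sampled power-level signal is available and treating a step between samples as steady when the power level changes by less than a tolerance. The rule, the tolerance, and the data are assumptions.

```python
from itertools import groupby


def partition(times, power_level, tol=0.5):
    """Split a test into (start_time, end_time, label) intervals.

    Each step between consecutive samples is labeled 'steady' when the
    power level changes by less than tol, otherwise 'transient'; runs of
    equal labels are merged into one interval.
    """
    steps = [
        (i, "steady" if abs(power_level[i] - power_level[i - 1]) < tol else "transient")
        for i in range(1, len(times))
    ]
    intervals = []
    for label, group in groupby(steps, key=lambda step: step[1]):
        idx = [i for i, _ in group]
        intervals.append((times[idx[0] - 1], times[idx[-1]], label))
    return intervals


# Hypothetical power-level trace (percent rated power level) at 1 s sampling.
times = [0, 1, 2, 3, 4, 5]
power = [100, 100, 100, 104, 109, 109]
intervals = partition(times, power)
```

Adjacent intervals share their boundary sample, so every sample belongs to at least one interval.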

4.3.2. Interactive Mode 

Data Flow 

Figure 35 shows the formal data flow diagram for the interactive mode processing 
modules in the sensor validation system. The software modules in this diagram are 
described next. 

Software Modules 

Plant Browser. Performs the "Display Plant Summary" function described in 
Section 4.1.5. 

Justification Browser. Performs the "Display Justification" function described in 
Section 4.1.5. 

PID Matrix Authorizer. Performs the "Display PID Matrix" function described in 
Section 4.1.5. 

Plot Generator. Performs the "Plot Generation" function described in Section 4.1.5. 


Figure 36. Interactive Mode Data Flow Diagram 

Family Update. Some or all of the characteristics for the engine under test will be 
added to the Family Characteristics database, once the PID Matrix has been 
authorized. 

Reconstruction. PID values not reconstructed during batch mode may be 
generated in real time if requested by the user. 

Failure History Update. The PID Reliability & Failure History database will be 
updated with the failed PIDs in the PID Matrix, once it has been authorized. 

UCR Generator. UCRs will be generated as specified in the PID Matrix, once it 
has been authorized. 

Validated Test Set Generator. A final engine test data set will be assembled from 
real and reconstructed PID values according to the PID Matrix, once it has been 
authorized. 

Report Generator. A final report will be generated, consisting of a brief summary 
of all failed PIDs, followed by a justification for each failure assessment. The 
report is based on the PID Matrix and the Justification Data file, and is generated 
once the PID Matrix has been authorized. 

4.3.3 External File Structure 

Engine Test Data Set - Ingres database containing the raw data from the test 
under analysis, with all values converted into engineering units. 

PID Reliability & Failure History - Ingres database containing the reliability 
(manufacturer’s statement) of each sensor, in addition to the failure history for 
each particular PID. 

Family-Averaged Models - Ingres database containing the characteristics, 
summarized at each power level, for all engines. 

Engine Models - Ingres database containing the characteristics, summarized at 
each power level, for the engine under test. 

Batch Mode Reconstructed Data - Ingres database containing all reconstructed 
PID values (same as Engine Test Data Set, except that the start and stop times 
and reconstruction method used are also recorded). 

Justification Data - An ASCII file containing text and plot generation information for 
each PID whose probability of failure exceeded the Report Threshold. 

Batch Report - ASCII text file, formatted for ease of reading into the interactive 
mode system, but also sufficiently annotated to make it usable as a hardcopy 
report. 

Validated Test Data - Ingres database; same format as Engine Test Data Set. 

Thresholds - Text file containing the report, reconstruction, and failure threshold 
values. 

4.4 Test Provisions 

The sensor assessment capabilities of the system will be evaluated by the 
following methods: 

Review of heuristics and strategies with experts. 

Running several test cases through the system, using real or simulated failures as 
necessary to obtain broad test coverage. 

Running two new test cases provided by NASA LeRC through the system, 
in the system will be evaluated during 


In addition, the overall capabilities of the system, including the interactive mode 
user interface, will be evaluated during demonstrations (as scheduled in Section 5.0) to 
members of NASA LeRC. 

5.0 SYSTEM DEVELOPMENT PLAN 

The Phase II development plan for the Sensor Validation system is shown in 
Figure 36. The development spans two and one-half years, at the end of which a fully 
functional software module, integrated with the Session Manager of the SSME Life 
Prediction system, will be delivered. The Sensor Validation system architecture, as
described above, allows for incremental addition of validation tests; thus, validation
techniques can be developed and implemented by several groups (e.g., neural network
and time series approaches by NASA Lewis) and integrated into the system before
delivery.

5.1 1991 Development Tasks

Batch System Design, NASA Review 

The architecture of the Batch Processing system will be designed. The result of this task 
will be a detailed design document, specifying data structures, software modules and 
their interfaces. This design document will be reviewed and approved by NASA before 
implementation proceeds. 

Information Fusion Implementation 

The Information Fusion module will be implemented, using the Bayesian Belief Network 
approach. This module will consist of procedure calls to define the network, to run the 
update/solution algorithm, and to extract results (probability distribution for any node in 
the network). 
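
As a minimal sketch of the kind of calculation such a module performs, the Python below treats one sensor as a two-state node (OK or failed) whose validation tests are conditionally independent children, and applies Bayes' rule to obtain the posterior failure probability. The network shape, the function name, and every probability value are illustrative assumptions, not the design described in this report.

```python
from math import prod

def posterior_failure(prior, tests):
    """Posterior P(sensor failed | test outcomes) for a two-layer belief network.

    prior: assumed prior probability that the sensor has failed.
    tests: list of (flagged, p_flag_given_failed, p_flag_given_ok) tuples,
           one per validation test; all values here are hypothetical.
    """
    def likelihood(failed):
        terms = []
        for flagged, p_fail, p_ok in tests:
            p = p_fail if failed else p_ok
            terms.append(p if flagged else 1.0 - p)
        return prod(terms)  # product of an empty list is 1.0

    num = prior * likelihood(True)
    den = num + (1.0 - prior) * likelihood(False)
    return num / den

# Hypothetical evidence: the redline test flags the PID,
# the redundant channel test does not.
p = posterior_failure(
    prior=0.1,
    tests=[(True, 0.9, 0.05),    # redline test
           (False, 0.8, 0.10)],  # redundant channel test
)
print(f"P(failed | evidence) = {p:.3f}")
```

Adding a validation test in this sketch only appends one tuple to the evidence list, which mirrors the incremental addition of tests that the architecture is meant to allow.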

Redline and Redundant Channel Test Implementation

The Redline and Redundant Channel Test modules will be the first validation techniques
implemented, since they are the most straightforward to implement and their results are
easily verifiable. In addition to the test modules themselves, the Steady-State/Transient
Partitioning, Time-Slice Partitioning, and Batch Report Generation modules will be
implemented and integrated with the Information Fusion module so that the test modules
can be fully tested. The Redline and Redundant Channel tests will be fully functional for
all 114 critical PIDs described in Section 3.1.1.

Analytic & Empirical Test Framework Implementation

The Analytic and Empirical Test modules will be implemented. This initial implementation 
will utilize family-averaged characteristic values (the Model Sampling module will not be 
implemented), and the tests will only be used on steady-state test data segments. In 
addition, only those models developed in Phase I will be implemented (covering 
approximately 50 PIDs). The Model Sampling module, characteristic database, and 
remaining models will be developed in 1992. 

Plant Display Tailoring 

An existing prototype of the Interactive Mode Plant Browser will be modified to display 
the results of the Batch Processing system for all 131 PIDs. 

Review and Demonstration 

A final review will be conducted at Aerojet Propulsion Division in Sacramento, California. 
The review will include a demonstration of the Sensor Validation system on the test data 
sets currently in Aerojet’s possession. A report of the 1991 activities will be delivered to 
NASA TBD days following the final review. 

Summary 

In 1991 a minimally-functional Batch Processing system will be developed, which will 
perform validation on all 131 critical PIDs. A very simple graphical user interface 
(consisting of the Plant Browser module of the Interactive System) will be developed for 
displaying the results of the Batch processing. This development is expected to take 
approximately 5-1/2 person-months of effort.

5.2 1992 Development Tasks 

Interactive System Design, NASA Review of Design Document

The architecture of the Interactive Processing system will be designed. The result of this
task will be a detailed design document, specifying data structures, software modules
and their interfaces. This design document will be reviewed and approved by NASA
before implementation proceeds.

PID Matrix Implementation

The PID Matrix Authorizer module will be implemented, allowing the analyst to browse 
and modify the results of the Batch Mode processing. 

Plot Generation Implementation 

The Plot Generation module will be implemented for plotting PID values (as requested by 
the analyst) and for displaying graphical elements of Batch Mode justifications. 


Analytical & Empirical Modeling 

The Model Sampling module in the Batch Processing system and the Family Update 
module in the Interactive Processing system will be implemented, in addition to the 
Family and Engine characteristic databases. A complete set of analytical and empirical 
models, covering all 114 critical PIDs, will be developed and implemented.

Reconstruction Implementation 

The Reconstruction module (used in both the Batch and Interactive systems) and the 
Validated Test Set Generator modules will be implemented. 

Evaluation on Test Cases 

The Sensor Validation system (the Batch Mode module of which will be essentially 
complete) will be evaluated on test cases provided by NASA Lewis. 

Review and Demonstration 

A final review will be conducted at NASA Lewis Research Center in Cleveland, Ohio. 
The review will include a demonstration of the Sensor Validation system on the test 
cases provided by NASA. A report of the 1992 activities will be delivered to NASA TBD 
days following the final review. 

Summary 

In 1992 the Sensor Validation system will be complete, except for the ability to justify 
conclusions, and without the integration of validation techniques developed by other 
groups. At this point, the system can be fielded at MSFC for initial evaluation.

5.3 1993 Development Tasks

Justification Generation and Display 

The ability for the Batch Mode system to generate justifications, and for the Interactive
Mode system to display them, will be implemented.

Report Generation 

The Report Generation module in the Interactive Mode system will be implemented to 
produce SunWrite-editable reports describing and justifying the conclusions reached by 
the Sensor Validation system. 

UCR Generation 

The UCR Generation module in the Interactive Mode system will be implemented to 
produce UCRs as requested by the analyst. 

Integration of Tests 

Sensor validation techniques developed by other groups will be integrated into the final 
sensor validation system. This includes extending the Information Fusion module to 
incorporate the results from these tests. 

Integration with Session Manager 

Both the Batch and Interactive mode systems will be integrated with the Session 
Manager, so that the Sensor Validation system can be run from the unified Life 
Prediction system interface. 

Review and Demonstration 

A final review will be conducted at NASA Lewis Research Center in Cleveland, Ohio. 
The review will include a demonstration of the Sensor Validation system on additional 
test cases provided by NASA. A report of the 1993 activities will be delivered to NASA 
TBD days following the final review. 

Training 

One week of on-site training will be provided to analysts wishing to use the Sensor
Validation system. The training will cover not only how to run both modes of the system
but also how to modify the system’s parameters (i.e., Batch thresholds, Test
parameters, and Belief Network probabilities).

Appendix A: INTERVIEWS WITH DATA VALIDATION EXPERTS

A trip was taken to NASA MSFC on 4-7 June to meet with SSME data
analysts and engine experts. The interviews were conducted with both Martin
Marietta and NASA personnel. In addition, two post-test data reviews were attended.
The results of these interviews have been used to prepare the Users Requirements
Document (URD). In addition, interviews have been conducted with Aerojet
personnel knowledgeable about sensor data validation.

A brief summary of key interviews and comments from the trip to NASA MSFC
is given below. The following questions were asked of the NASA MSFC personnel
during the interviews:

(1) What is the current data validation procedure?

(2) How is previous test data used in the validation procedure?

(3) How are test stand, engine, and component variations accounted for?

(4) What types of computer interfaces are used or would be most useful
(i.e. hardcopy plots, databases, on-line plots and zooms)?

(5) Which sensors historically are prone to failure?

(6) What records of failed PIDs are maintained?

(7) What I/O format is required to handle data at MSFC?

(8) What analytical models of the SSME might be available?

(9) Who at MSFC would be principal users of this system?

(10) Who at MSFC would be interested in receiving status updates on the
progress of the system?

(11) Who at MSFC could be asked questions regarding specific PIDs on recent
tests?


General Comments 

All the NASA MSFC and Martin Marietta personnel interviewed were extremely 
helpful, interested, and generous with their time during the trip to Huntsville. Two 
types of meetings were conducted. First, general overview meetings were held with
Darby Makel and Mark Gage of Aerojet, and Ron De Hoff of SCT, along with a given
MSFC interviewee. In these meetings the program objectives and the relationships
between Task 3 and Task 4 were explained. These meetings were then followed up
with one-on-one meetings between Darby Makel and the various SSME data
analysts.

Interview with David Vaughn: 

David Vaughn is the manager of the Martin Marietta Data Analysis Group. He was 
very supportive of the program objectives and stated that there is a "real need" for the 
sensor data validation code as soon as possible. He said he would be the contact 
person for providing information and data regarding the historical behavior of 
particular PIDs and test firings. David described their current sensor data validation
procedure as a "confirmation procedure," where the data analysts have sufficient
experience and intuition that they can inspect other transducer readings to confirm a
failed sensor reading. While this procedure is qualitative, it appears to be a good
starting place for a rule-based approach. While David did not provide specifics
regarding system requirements, he emphasized the need to link the Sun into their 
overall data flow. 

Interview with David Foust: 

David Foust is the lead engineer in the Data Analysis Group; he reports to David
Vaughn. David’s group has the primary responsibility for detecting sensor failures as
part of their overall responsibility to review engine operation from test to test. The
data analysis group examines a standard set of data plots for each test. If anomalies
are detected, other plots are requested after the initial review or previous test records
are examined. Sensor failure detection depends on the analysts' manual review of
the plots. In addition, not all of the CADS and Facility PIDs are plotted. Failed PIDs 
not plotted may never be detected (however, these PIDs are not very important for 
assessing engine operation). Once a failed PID is detected, a confirmation 
procedure is used to determine if the signal is the result of a sensor problem or an 
engine problem. If the anomalous signal occurs during main stage, the sensor
reading during pre-test and post-test is examined to see if erratic behavior or scaling
problems exist. In addition, other sensors which should be similarly affected by
off-nominal engine behavior are examined. A typical confirmation is to examine
pressures and temperatures upstream and downstream of an anomalous sensor.
The sensor reading will also be compared to previous test data. The previous test
data must always be from the same test stand and preferably with the same engine
and the same turbopumps. If data with the same engine is not available, then data
from an engine with as many of the same pumps as possible is used. The degree of
variation in a sensor reading is judged based on a two-sigma data base maintained
by David Foust's group. A different two-sigma data base exists for each of the three
NASA SSC test stands. As a general rule, a "typical" transducer will fail once in
every 10 tests and a small number of "problem" transducers fail on more than 50% of
tests. David Foust viewed an automated sensor data validation procedure as a
significant improvement to the efficiency of the current data review process.
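
The two-sigma comparison described above can be sketched as follows. This is a minimal illustration, assuming a table of mean and standard deviation kept per test stand and PID; the stand label, PID number, and statistics are hypothetical and are not taken from the MSFC data base.

```python
def outside_two_sigma(value, mean, sigma, n_sigma=2.0):
    """Return True if a steady-state reading lies outside mean +/- n_sigma*sigma."""
    return abs(value - mean) > n_sigma * sigma

# Hypothetical two-sigma data base, keyed by (test stand, PID).
two_sigma_db = {
    ("stand_1", 100): {"mean": 3000.0, "sigma": 15.0},
}

def check_reading(stand, pid, value, db=two_sigma_db):
    """Look up the stand-specific statistics and flag an out-of-family reading."""
    stats = db[(stand, pid)]
    return outside_two_sigma(value, stats["mean"], stats["sigma"])

print(check_reading("stand_1", 100, 3020.0))  # 20 from the mean, band is +/- 30
print(check_reading("stand_1", 100, 3045.0))  # 45 from the mean, band is +/- 30
```

Keying the table by test stand reflects the statement that each stand has its own two-sigma data base; a reading is only compared against statistics from the stand it was recorded on.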


Interview with Brian Pierkarski: 

Brian Pierkarski is the lead engineer in the Model Analysis Group; he reports to David
Vaughn. The Model Analysis group is responsible for calculating the performance for
each test. This group uses the test data as an input to the steady-state performance
model. Brian's group is very sensitive to the issue of sensor failures and is very
supportive of the data validation and signal reconstruction code. The calculation of
specific impulse is very sensitive to slight errors in the model input PIDs. Slight
sensor drifts which are within the two-sigma band, and may not appear significant to the
data analysts assessing engine operation, can cause appreciable errors in the
performance calculation. Signal reconstruction is of particular interest to the model
group. Currently, if a PID needed for the steady-state model is missing due to a
sensor failure, an approximate average value is input by the operator.

Interview with Marc Neely: 

Marc Neely is a NASA MSFC engineer and works in the Liquid Propulsion Branch. 
Marc was very interested in the sensor data validation program and offered to provide 
help as the program evolved. He reiterated much the same technical information as 
discussed above. In addition, he feels that an expert system approach is the most
suitable based on the current state of knowledge of the SSME and the lack of a good
model which can yield data that accurately predicts sensor readings. He expressed
the opinion that it would be necessary to bring a beta-test version of the code to
MSFC, with on-site support from Aerojet. This task would be needed to test out its
operation and build confidence among the analysts and NASA management in the
code's operation and reliability.

Interview with Bill Baker (Aerojet)

Bill Baker is in Data Services. Bill performs data analysis for about 15 
programs (including Titan) in the test area. 

• The primary function of Data Services is to provide 'Qualified Data' to
Engineering. Thus, the detection and resolution of sensor anomalies is their
responsibility (although they frequently work together with Engineering in resolving
problems which turn out to be sensor malfunctions).

• Procedure: 

-Sensors are calibrated in the lab to specifications. 

-Sensors are installed on the stand and hooked up to electrical and recording 
equipment. 

-A pre-test electrical calibration is run. Each sensor is stepped through four
voltage levels (25, 50, 75, and 100% of maximum nominal value) to simulate its
output. The sensor’s responses to these excitations are recorded and the absolute
value, linearity, and return to zero are computed for each test.

-Immediately following a test a post-test electrical calibration is run.

-Approximately one hour after the test (when the engine has completely
depressurized) another post-test electrical calibration is run.

(All of the above data is available for diagnostic purposes.) 

For each sensed parameter the following values are computed for each
steady-state summarized time slice for post-test analysis:

-Standard deviation for each sensor. 

-Variance of each sensor from a family nominal value. 

-For duplex sensors: Difference and percent difference from each other 
(compared to historical difference). 

• No hard "redlines" are used to automatically discredit sensors (although Bill
mentioned that a 3-sigma variance warranted investigation).

• If two duplex sensors differ by too much and one is especially noisy, then you
tend to discredit the noisy one.

There are two kinds of noise:

-Random (positive and negative variance). 

-Spike (positive OR negative variance at periodic intervals); indicative of an 
electrical problem. 

If you suspect a sensor problem, check the values upstream and downstream 
from it. 

Bill always looks at computed performance data first. If there is a problem, 
then he starts looking at suspect sensors on an as-needed basis. 

If there is a serious sensor problem, Bill will often re-calibrate the sensor in the
metrology lab and apply a correction factor (derived from the re-calibration
procedure) to the sensor data.

Unless a problem is detected, Bill typically does not look at:

-Facility sensors. 

-Post-test calibration data. 

-Transient data plots. 

Try to look at invariant relationships among sensors to diagnose problems
(e.g., ΔP or computed resistance). Example:

Interview with Joe Berroteran (Aerojet)

• Joe is head of Aerojet's Instrumentation group in Design Engineering. He used 
to work for Rocketdyne and performed failure analysis on the SSME 
instrumentation. 

• Current SSME controller sensor validation procedure: 

-Each sensor has yellow lines, red lines, and reasonableness lines (piecewise 
models for all steady-state and transient conditions, see below), expressing upper 
and lower bounds for the sensor's values. A yellow line represents the normal 
operating range. Red lines demarcate abnormal (often unsafe) operating ranges. 
Reasonableness lines demarcate regions which are physically impossible for the 
sensor's value to fall within. 

-Duplex sensors (including most SSME sensors): If either sensor is above the 
red line but below the reasonableness line for two consecutive samples, then the 
engine is shut down (for certain critical sensors). Otherwise, if either sensor is 
over the reasonableness line for two consecutive counts, then that sensor is assumed to
have failed and is ignored.

• SSME test procedures do include a pre- and post-test calibration.

There are five major classes of sensors on the SSME: temperature, pressure, 
flow, speed, and acceleration. 

Open circuits (both hard and intermittent, caused, e.g., by a broken wire) are
probably the most common sensor failure. Typically an open circuit will cause a
sensor's value to go to zero (or some very small constant "offset" value).
Intermittent open circuits show up as instantaneous variations from normal to zero
(or offset).

Short circuits (e.g., due to contamination) are very rare (Joe has only seen one
or two in over 10 years). Instrument circuitry is typically designed to shut down the
sensor if a short occurs.

"Spikes" - sensor readings which change faster than their measured
parameters could physically change - are indicative of sensor failures.

Speed and flow sensors are subject to "aliasing" in which the sensor reading 
has a low frequency sine wave superimposed onto it. This is fundamentally due to 
a timing problem, but can be caused by mis-alignments of the sensors (e.g., Joe 
had an example of a flow meter which suffered from aliasing due to its four 
impeller blades not being at perfect right angles to each other). This is a very rare
problem. 

Pressure sensors are subject to drift due to thermal effects. Drift typically 
shows up as a constantly increasing or decreasing bias. 

• Pressure sensors are also subject to "overshoot" of about 3-5 psi. It can take
around 15 seconds for them to settle to their correct value.

• Temperature sensors can go out of calibration due to thermal expansion. This
results in a constant bias (1% FS) in the open circuit direction.

• Pressure sensors have a "reference vacuum" on the inner side of their
diaphragm. These sensors can lose this vacuum, resulting in a constant bias (this
should show up as a difference between pre- and post-test calibrations).

• Sensor accuracy specs for SSME: 

-All sensor systems must be accurate to within 2% FS (including transmission, 
A/D conversion, etc.). 

-Pressure and temperature transducers must be accurate to within 1/4% FS.

-Flow and speed transducers must be accurate to within 1/2% FS (?). 
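
The controller logic Joe described earlier in this interview (red lines, reasonableness lines, and a two-consecutive-sample persistence requirement) can be sketched for a single channel as follows. This is a simplified illustration that assumes upper limits only; the function name, limit values, and return labels are hypothetical and do not reproduce the actual SSME controller software.

```python
def classify_channel(samples, red_line, reasonableness_line, persistence=2):
    """Classify one sensor channel from its sample history.

    Returns "failed" as soon as `persistence` consecutive samples exceed the
    reasonableness line, "redline" as soon as that many consecutive samples lie
    above the red line but not above the reasonableness line, otherwise "ok".
    Upper limits only; limit values are illustrative assumptions.
    """
    red_run = 0
    unreasonable_run = 0
    for x in samples:
        if x > reasonableness_line:
            unreasonable_run += 1
            red_run = 0
        elif x > red_line:
            red_run += 1
            unreasonable_run = 0
        else:
            red_run = 0
            unreasonable_run = 0
        if unreasonable_run >= persistence:
            return "failed"    # sensor is assumed failed and ignored
        if red_run >= persistence:
            return "redline"   # shutdown condition for certain critical sensors
    return "ok"

print(classify_channel([100, 105, 130, 131], red_line=120, reasonableness_line=200))
print(classify_channel([100, 250, 260], red_line=120, reasonableness_line=200))
print(classify_channel([100, 130, 100, 130], red_line=120, reasonableness_line=200))
```

The persistence counter is what keeps a single out-of-range sample from triggering either outcome, which matches the "two consecutive samples" wording above.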

Interview with Bill Ferrell (Aerojet) 

• Bill Ferrell has 31 years' experience at Aerojet (including 5 years in the test 
area), mostly on the Titan program. He currently analyzes data from Titan flights 
and acceptance tests, particularly when an anomaly is discovered.

Bill confirmed that the primary way sensor malfunctions are currently detected
and diagnosed on Titan is through analysis of the engine characteristics 
(resistances, performance biases, etc.). These characteristics should be relatively 
constant across the range of engine performance, and should have a predictable 
variance from engine to engine, both making anomaly detection easier. 


Bill showed me plots of three related characteristics from the most recent Titan 
acceptance test which clearly demonstrate the usefulness of using characteristics 
for anomaly detection and diagnosis (see the two pages attached to this report). 
These three parameters are: 

ROT - Total resistance from turbopump discharge to chamber.

ROL - Resistance from turbopump discharge to injector.

ROJ - Resistance across injector (also called ROTCA for "Chamber Assembly").

Resistance = (Pressure drop across line) * (Specific Gravity) / (Flow)^2

In the most recent test, the values for ROL, ROJ, and ROT were as shown on 
the first graph. ROT stayed constant, but ROL and ROJ varied. However, their
variance was equal in magnitude and opposite in direction, indicating a problem 
with the POJ sensor. Note that if all resistances are high or low, then the sensors 
used for computing Specific Gravity or Flow should be suspect. 
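
The diagnosis above follows from how the three resistances share pressure measurements. The sketch below is a hedged illustration: it assumes the flow term in the resistance formula is squared, that POJ is the pressure measured at the injector inlet, and that the turbopump discharge and chamber pressures are measured independently. All numerical values are hypothetical.

```python
def resistance(p_upstream, p_downstream, specific_gravity, flow):
    """Resistance = (pressure drop) * (specific gravity) / flow**2.

    The squared flow term is an assumption of this sketch.
    """
    return (p_upstream - p_downstream) * specific_gravity / flow**2

def characteristics(p_discharge, p_injector, p_chamber, sg, flow):
    """Return (ROL, ROJ, ROT) from three pressures, specific gravity, and flow."""
    rol = resistance(p_discharge, p_injector, sg, flow)  # discharge to injector
    roj = resistance(p_injector, p_chamber, sg, flow)    # across injector
    rot = resistance(p_discharge, p_chamber, sg, flow)   # discharge to chamber
    return rol, roj, rot

# Hypothetical readings: nominal, then the same test with a +20 bias on POJ.
nominal = characteristics(1200.0, 1000.0, 800.0, sg=1.14, flow=50.0)
biased = characteristics(1200.0, 1020.0, 800.0, sg=1.14, flow=50.0)

d_rol, d_roj, d_rot = (b - n for b, n in zip(biased, nominal))
print(f"dROL={d_rol:+.5f}  dROJ={d_roj:+.5f}  dROT={d_rot:+.5f}")
```

Because POJ enters ROL as the downstream pressure and ROJ as the upstream pressure, a bias on it moves the two resistances by equal and opposite amounts while ROT, which does not use POJ, is unchanged.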

Bill mentioned that accuracy of computed characteristics depends on the 
accuracy of the sensors involved and the formulas used in their derivation. For 
example, if the pressure drop between two pressure sensors is very small (e.g., 
across a pipe) then the computed resistance will have a high variance because a 
small change in either pressure will have a large effect on computed resistance. 

A second method for detecting sensor failures used by Ferrell is to look at the 
transient curves of characteristics (even though they don't change much, they still 
experience small transients on engine startup and shutdown). The second chart
attached shows ROJ for the last five Titan turbopump tests over time. The most
recent test clearly shows a deviation from family norms.

Another diagnostic method that Ferrell suggested (given that an engine has 
already been tested once to derive an initial set of characteristics) involves taking a 
few critical sensor values (at steady-state) and deriving what all other sensor 
values should be using the engine's characteristics as true. Deviations of sensors 
from these "reconstructed" values are then used for anomaly detection. 
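
A minimal sketch of this reconstruction idea is given below. It assumes the resistance definition with a squared flow term (an assumption of this sketch), takes chamber pressure, specific gravity, and flow as the trusted critical values, and uses hypothetical characteristic values standing in for those derived from an earlier test.

```python
def reconstruct_upstream_pressure(p_downstream, resistance, specific_gravity, flow):
    """Invert Resistance = dP * SG / flow**2 to predict the upstream pressure."""
    return p_downstream + resistance * flow**2 / specific_gravity

def residual(measured, reconstructed):
    """Deviation of a sensor reading from its reconstructed value."""
    return measured - reconstructed

# Hypothetical engine characteristics from an earlier test.
ROJ = 0.0912   # resistance across injector
ROL = 0.0912   # resistance from turbopump discharge to injector

# Critical steady-state values assumed trustworthy (hypothetical).
p_chamber, sg, flow = 800.0, 1.14, 50.0

# Work upstream from the chamber: injector inlet first, then pump discharge.
p_injector_hat = reconstruct_upstream_pressure(p_chamber, ROJ, sg, flow)
p_discharge_hat = reconstruct_upstream_pressure(p_injector_hat, ROL, sg, flow)

print(round(p_injector_hat, 1), round(p_discharge_hat, 1))
print(round(residual(1020.0, p_injector_hat), 1))
```

A residual that is large compared with the normal scatter of that sensor would be the anomaly indicator; what counts as large is left open here.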

Bill also mentioned that the pre-test and post-test calibrations for sensors 
should be compared (see the 1/17/91 interview notes for Bill Baker); any 
mismatches are indicative of malfunction. 

Two other heuristics that Bill mentioned were: 

-If a sensed value has an unusually high variance it typically indicates a sensor 
problem (Bill visually inspects transient plots and knows what normal variances 
"look" like, see the third chart attached to this write-up). 

-If a sensed value has an unusually high frequency it typically indicates a 
sensor problem, especially if the signal is changing faster than the process could 
physically change. 
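
The last heuristic can be sketched as a rate-of-change check. This is an illustrative example only: the maximum physically plausible rate must be supplied per sensor, and the time stamps, readings, and limit below are hypothetical.

```python
def find_spikes(times, values, max_rate):
    """Return indices of samples whose rate of change exceeds max_rate.

    max_rate is the fastest change (units per second) the measured process
    could physically undergo; its value is an assumption supplied by the user.
    """
    spikes = []
    for i in range(1, len(values)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            continue  # skip repeated or out-of-order time stamps
        rate = abs(values[i] - values[i - 1]) / dt
        if rate > max_rate:
            spikes.append(i)
    return spikes

times = [0.00, 0.04, 0.08, 0.12, 0.16]
values = [500.0, 501.0, 650.0, 502.0, 503.0]
print(find_spikes(times, values, max_rate=1000.0))
```

A single spike is reported twice, once for the jump up and once for the return, so consecutive flagged indices can be read as one event.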

Appendix B: CHARACTERISTIC AND REGRESSION MODELS

SUBJECT: Two-sigma Variation of CADS Sensor Data as a Function of
Engine Power Level

RL Bickford, DB Makel, Dept. 9842 file

Enclosures:

The attached encloses LOX inlet [illegible] sources. Enclosure 1 contains typical
flight data for [illegible]. These [illegible] correlated. [illegible] the .04 percent
data base in enclosure 2.

LOX inlet [illegible] occurring at approximately [illegible] seconds after liftoff can be
respectively seen in the last two sensor traces in enclosure 1.

The results of the two-sigma analysis are shown in enclosure 2.

If I can be of any further assistance, [illegible] be reached at [illegible], but the
demands on his time are high.

Douglas M. Matson

ENCLOSURE 1: Typical Flight Data for Selected Parameters

[Plots: MCC Pressure; HPFTT Discharge Temperature; HPOTT Discharge Temperature]

ENCLOSURE 2: CADS Flight Data as a Function of Engine Power Level

Phase II Data Base (Issue Date 10-13-89)

[illegible] Summary of CADS sensor data encompassing

Commercially Available Pattern Matching and Neural Network Software

1. NeuroSym Neurocomputing Library
   Product Description: Library of neural networks programmed in C language;
   requires C compiler; source code provided; 12 architectures included.
   Publisher: NeuroSym Corporation, P. O. Box 980683, Houston, TX 77098-0683,
   (713) 523-5777

2. Brain Maker V 2.3; Brain Maker Professional V 2.0
   Product Description: Stand-alone neural net development tool; some C source
   code provided with Professional V 2.0; 8 architectures included.
   Publisher: California Scientific Software, 10141 Evening Star Dr., #6,
   Grass Valley, CA 95945, (916) 477-7481

3. NeuroShell
   Product Description: Stand-alone neural net development tool; source code
   provided with run time option; 2 architectures available.
   Publisher: Ward Systems Group, Inc., 245 West Patrick Street,
   Frederick, MD 21701

4. Professional II Plus
   Product Description: Stand-alone neural net development tool;
   31 architectures supported.
   Publisher: Neural Ware, Inc., Penn Center West, Bldg. IV, Suite,
   Pittsburgh, PA 15276, (412) 787-8222

5. Explore Net 3000
   Product Description: Stand-alone neural net development tool; requires AT
   with Microsoft Windows; 21 architectures included; no source code available.
   Publisher: Hecht-Nielsen Neurocomputers, 5501 Oberlin Drive,
   San Diego, CA 92121, (619) 546-8877

6. pLOGIC
   Product Description: Stand-alone statistical pattern-recognition software;
   requires 386-based PC with math coprocessor, plus 2 megabytes extended
   memory; source code available, but not included.
   Publisher: pLOGIC Knowledge Systems, Inc., 23133 Hawthorne Blvd., 3rd Floor,
   Torrance, CA 90505, (213) 378-3760

7. NDS1000 Version 1.2
   Product Description: Stand-alone neural network pattern recognition tool;
   uses 1 proprietary architecture; versions available for PC and Sun
   Workstations; source code not available. Can be imbedded in hardware.
   Publisher: Nestor, Inc., One Richmond Square, Providence, RI 02906

8. MacBrain
   Product Description: Stand-alone neural net development tool; requires
   Macintosh Plus or better; 12 architectures and some source code (C and
   Pascal) included.
   Publisher: Neurix, 327 A Street, 6th Fl., Boston, MA 02102, (617) 577-1202

NASA

Report Documentation Page

1. Report No.: CR 187124

2. Government Accession No.:

3. Recipient’s Catalog No.:

4. Title and Subtitle: Sensor Data Validation and Reconstruction,
   Phase 1: System Architecture Study

5. Report Date: June 1991

6. Performing Organization Code:

7. Author(s): D.K. Makel, W.H. Flaspohler, T.W. Bickmore

8. Performing Organization Report No.:

9. Performing Organization Name and Address:
   Aerojet Propulsion Division
   P. O. Box 13222
   Sacramento, California 95813-6000

10. Work Unit No.:

11. Contract or Grant No.: NAS 3-25883

12. Sponsoring Agency Name and Address:
    National Aeronautics and Space Administration
    Lewis Research Center
    21000 Brookpark Road
    Cleveland, Ohio 44135

13. Type of Report and Period Covered: Task 3 Summary Report,
    May 1990 - Feb 1991

14. Sponsoring Agency Code:

15. Supplementary Notes

This final report describes the work performed under Task Order 3 of the Development of
Life Prediction Capabilities contract.

The NASA technical monitor was Claudia Meyer.

16. Abstract

The sensor validation and data reconstruction task (1) reviewed relevant literature and
selected applicable validation and reconstruction techniques for further study, (2) analyzed the
selected techniques and emphasized those which could be used for both validation and
reconstruction, (3) analyzed SSME hot fire test data to determine statistical and physical
relationships between various parameters, (4) developed statistical and empirical correlations
between parameters to perform validation and reconstruction tasks, using a computer aided
engineering (CAE) package, and (5) conceptually designed an expert system based knowledge
fusion tool, which allows the user to relate diverse types of information when validating sensor
data. The host hardware for the system is intended to be a Sun SPARCstation, but could be
any RISC workstation with a UNIX operating system and a windowing/graphics system such as
Motif or Dataviews. The information fusion tool is intended to be developed using the NEXPERT
Object expert system shell and the C programming language.

17. Keywords (Suggested by Author(s))

Sensor Validation, Data Reconstruction,
Space Shuttle Main Engine, Rocket Engine
Diagnostics, Expert Systems, Bayesian Belief
Networks

18. Distribution Statement: Unclassified - Unlimited

19. Security Classif. (of this report): Unclassified

20. Security Classif. (of this page): Unclassified

21. No. of pages: 327

22. Price:

NASA FORM 1626 OCT 86